Big Data Hadoop Expert Quiz


This Big Data Hadoop Expert Quiz contains a set of multiple-choice questions that will help you prepare for any expert-level Big Data exam.



1) What is the identity mapper in Hadoop?

  1. When the same mapper runs on different datasets in the same job.
  2. When the same mapper runs on different datasets in different jobs.
  3. The mapper class used when no mapper is specified for a job.
  4. Both b and c are correct.

Answer : c
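
For illustration, a minimal sketch (new org.apache.hadoop.mapreduce API; the class name is ours) of what the identity mapper effectively does when a job sets no mapper class:

    import java.io.IOException;
    import org.apache.hadoop.mapreduce.Mapper;

    // Pass-through mapper: mirrors the default behaviour Hadoop falls back
    // to when no mapper class is configured for a job.
    public class PassThroughMapper<K, V> extends Mapper<K, V, K, V> {
        @Override
        protected void map(K key, V value, Context context)
                throws IOException, InterruptedException {
            context.write(key, value); // emit the input record unchanged
        }
    }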

 
 
2) Which scheduler is used by default in the MapReduce framework?

  1. Capacity scheduler.
  2. Fair scheduler.
  3. Job scheduler.
  4. Lazy scheduler.

Answer : b

 
 
3) You are executing Pig in MapReduce mode and you want to execute it in local mode instead. How can you achieve this?

  1. pig -x local
  2. pig -x mapreduce
  3. Both are correct
  4. None of these

Answer : a

 
 
4) The feature in Hadoop 2.0 that scales the NameNode horizontally is known as

  1. Federation
  2. Fencing
  3. NameNode HA
  4. None of these.

Answer : a

 
 
5) What is true about the partitioner in a MapReduce job?

  1. The number of partitions is the same as the number of reducers.
  2. The default partitioner in MapReduce is the hash partitioner.
  3. All of these.
  4. None of these.

Answer : c

 
 
6) Which of the following is not a role of the Reporter in a MapReduce program?

  1. Reports the progress of mappers and reducers.
  2. Sets application-level status messages.
  3. Updates counters.
  4. Helps in relaunching a failed job with the help of counters.

Answer : d
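
A hedged sketch of the first three roles using the old org.apache.hadoop.mapred API (class, status, and counter names are illustrative); note the Reporter plays no part in relaunching failed jobs:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class ReportingMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            reporter.setStatus("processing offset " + key.get()); // status message
            reporter.incrCounter("demo", "records", 1L);          // update a counter
            reporter.progress();                                  // report progress/liveness
            output.collect(value, new IntWritable(1));
        }
    }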

 
 
7) What does mapred.tip.id refer to when debugging?

  1. Id of the mapper currently running.
  2. Id of the reducer currently running.
  3. Id of the task currently running.
  4. Id of the job currently running.

Answer : c

 
 
8) Which of the following schedulers supports multiple queues?

  1. Fair scheduler
  2. Capacity scheduler.
  3. Lazy scheduler.
  4. Default scheduler.

Answer : b

 
 
9) How can you set a debug script for a Hadoop MR job?

  1. JobConf.setMapDebugScript(String)
  2. JobConf.setReducerDebugScript(String)
  3. JobConf.setJobScript(String)
  4. None of these.

Answer : a
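
A hedged sketch of the old-API calls (script paths are illustrative; in practice the scripts must also be shipped to the nodes, e.g. via the distributed cache):

    import org.apache.hadoop.mapred.JobConf;

    public class DebugScriptConfig {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            conf.setMapDebugScript("./debug-map.sh");       // runs when a map task fails
            conf.setReduceDebugScript("./debug-reduce.sh"); // reduce-side counterpart
        }
    }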

 
 
10) What is the default data type in Pig?

  1. Bytearray
  2. Chararray
  3. Textarray
  4. None

Answer : a




 
11) Which operator in Pig is not associated with loading and storing?

  1. Load
  2. Store
  3. Dump
  4. Split

Answer : d

 
 
12) How can you enable the MemStore-Local Allocation Buffer, a feature that helps prevent heap fragmentation under heavy write loads?

  1. hbase.hregion.memstore.mslab.enabled
  2. hbase.hregion.memstore.enabled
  3. hbase.hregion.memstore.mslab.job.enabled
  4. none

Answer : a

 
 
13) Which hashing algorithm is used by default for the hash function in HBase?

  1. Murmur
  2. Lazy
  3. Default
  4. None.

Answer : a

 
 
14) Which policy configuration file is used by RPC servers to make authorization decisions on client requests?

  1. hadoop.policy.file= hbase-policy.xml
  2. hadoop.policy.file.apache= hbase-policy.xml
  3. hadoop.policy.file.enable= hbase-policy.xml
  4. none

Answer : a

 
 
15) How can you specify the destination directory in Sqoop?

  1. --target-dir <dir>
  2. --destination-dir <dir>
  3. --hdfs-dir <dir>
  4. All of the above

Answer : a

 
 
16) How can you enable compression in Sqoop?

  1. -z
  2. --compress
  3. Both a and b
  4. None.

Answer : c

 
 
17) Say you are importing from PostgreSQL with Sqoop in direct mode; you can split the import into separate files once individual files reach a certain size. How can you do it?

  1. --direct-split-size
  2. --input-split-size
  3. --postgresql-split-size
  4. None

Answer : a

 
 
18) What is the default port for the NameNode web UI?

  1. 50060
  2. 50050
  3. 50070
  4. None of the above

Answer : c

 
 
19) How many methods does the Writable interface define?

  1. Two
  2. Three
  3. Six
  4. None of the above

Answer : a

 
 
20) Which programming languages are supported for MapReduce?

  1. The most common programming language is Java, but scripting languages are also supported via Hadoop streaming.
  2. Any programming language that can comply with Map Reduce concept can be supported.
  3. Only Java supported since Hadoop was written in Java.
  4. Currently Map Reduce supports Java, C, C++ and COBOL.

Answer : a




 
21) What is a map-side join?

  1. Map-side join is a technique in which data is eliminated at the map step
  2. Map-side join is done in the map phase and done in memory
  3. Map-side join is a form of map-reduce API which joins data from different locations
  4. None of these answers are correct

Answer : b
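
A minimal sketch of the idea, assuming the smaller dataset fits in memory (field layout and all names are illustrative): the small side is loaded in setup(), so the join completes in the map phase without a shuffle:

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Map<String, String> lookup = new HashMap<>();

        @Override
        protected void setup(Context context) {
            // A real job would load the small dataset here, typically from
            // a file shipped via the distributed cache.
            lookup.put("42", "side-data-for-42");
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            String side = lookup.get(fields[0]);   // join on the first column
            if (fields.length > 1 && side != null) {
                context.write(new Text(fields[0]), new Text(fields[1] + "," + side));
            }
        }
    }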

 
 
22) What is a reduce-side join?

  1. Reduce-side join is a technique to eliminate data from initial data set at reduce step
  2. Reduce-side join is a technique for merging data from different sources based on a specific key. There are no memory restrictions
  3. Reduce-side join is a set of API to merge data from different sources.
  4. None of these answers are correct

Answer : b
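
A sketch of the reducer half of the pattern (the "A:"/"B:" source tags and names are illustrative): mappers over each source emit (joinKey, taggedRecord), and the reducer merges everything that shares a key:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class ReduceSideJoinReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            List<String> left = new ArrayList<>();
            List<String> right = new ArrayList<>();
            for (Text v : values) {                // split records by source tag
                String s = v.toString();
                if (s.startsWith("A:")) left.add(s.substring(2));
                else if (s.startsWith("B:")) right.add(s.substring(2));
            }
            for (String l : left) {                // emit every left/right pairing
                for (String r : right) {
                    context.write(key, new Text(l + "," + r));
                }
            }
        }
    }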

 
 
23) What is Avro?

  1. Avro is a Java serialization library
  2. Avro is a Java compression library
  3. Avro is a Java library that creates splittable files
  4. None of these answers are correct

Answer : a

 
 
24) Can you run MapReduce jobs directly on Avro data?

  1. Yes, Avro was specifically designed for data processing via Map-Reduce
  2. Yes, but additional extensive coding is required
  3. No, Avro was specifically designed for data storage only
  4. Avro specifies metadata that allows easier data access. This data cannot be used as part of map-reduce execution, rather input specification only.

Answer : a

 
 
25) Can a custom data type be implemented for MapReduce processing?

  1. No, Hadoop does not provide techniques for custom data types.
  2. Yes, but only for mappers.
  3. Yes, custom data types can be implemented as long as they implement the Writable interface.
  4. Yes, but only for reducers.

Answer : c
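
A minimal sketch of such a custom type (the fields are illustrative); a type used as a key would additionally need to implement WritableComparable:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    public class PointWritable implements Writable {
        private int x;
        private int y;

        @Override
        public void write(DataOutput out) throws IOException { // serialize
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readFields(DataInput in) throws IOException { // deserialize
            x = in.readInt();
            y = in.readInt();
        }
    }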

 
 
26) Which node acts as an access point for the external applications, tools, and users that need to utilize the Hadoop environment?

  1. DataNode
  2. NameNode
  3. JobTracker
  4. N/A

Answer : b

 
 
27) Which object can be used to get the progress of a particular job?

  1. Map
  2. Reducer
  3. Context
  4. Progress

Answer : c
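
A sketch (new API; names are illustrative) of how a running task uses the Context both to report status and progress and to emit output:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class StatusMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            context.setStatus("at offset " + key.get()); // application-level status
            context.progress();                          // signal liveness to the framework
            context.write(value, new IntWritable(1));
        }
    }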

 
 
28) Which node performs housekeeping functions for the NameNode?

  1. Datanode
  2. namenode
  3. Secondary NameNode
  4. Edge Node

Answer : c

 
 
29) Which of the following utilities allows you to create and run MapReduce jobs with any executable or script as the mapper and/or the reducer?

  1. Oozie
  2. Sqoop
  3. Flume
  4. Hadoop Streaming

Answer : d

 
 
30) Which MapReduce stage serves as a barrier, where all previous stages must be completed before it may proceed?

  1. Combine
  2. Group
  3. Reduce
  4. Write

Answer : b




 
31) Which TACC resource has support for Hadoop MapReduce?

  1. Ranger
  2. Longhorn
  3. Lonestar
  4. Spur

Answer : a

 
 
32) What is the implementation language of the Hadoop MapReduce framework?

  1. Java
  2. C
  3. FORTRAN
  4. Python

Answer : a

 
 
33) Which MapReduce phase is theoretically able to utilize features of the underlying file system in order to optimize parallel execution?

  1. Split
  2. Map
  3. Combine
  4. None of the above

Answer : a

 
 
34) The _______ shell is used to execute Pig Latin statements

  1. Execute
  2. Run
  3. Grunt
  4. N/A

Answer : c

 
 
35) The _______ operator is used to view the logical, physical, and MapReduce execution plans used to compute a relation

  1. Show
  2. Describe
  3. Display
  4. Explain

Answer : d

 
 
36) Pig was developed by

  1. Facebook
  2. Yahoo
  3. Twitter
  4. LinkedIn

Answer : b

 
 
37) What type of language is Pig?

  1. Declarative language
  2. Data flow language
  3. Both
  4. N/A

Answer : b

 
 
38) Which of the following is not a YARN daemon?

  1. Resource Manager
  2. Node Manager
  3. Application Master
  4. JobTracker

Answer : d

 
 
39) What happens when a Map task crashes while running a MapReduce job on a cluster configured with MapReduce version 1 (MRv1)?

  1. The framework closes the JVM instance and restarts
  2. The job immediately fails
  3. The JobTracker attempts to re-run the task on the same node
  4. The JobTracker attempts to re-run the task on a different node

Answer : d

 
 
40) Which daemon reports available slots for scheduling a Map or Reduce operation in MapReduce version 1 (MRv1)?

  1. TaskTracker
  2. JobTracker
  3. Secondary NameNode
  4. DataNode

Answer : a




 
41) How is the number of Mappers determined for a MapReduce job?

  1. The number of Mappers is calculated by the NameNode based on the number of HDFS blocks in the files.
  2. The developer specifies the number in the job configuration.
  3. The JobTracker chooses the number based on the number of available nodes.
  4. The number of Mappers is equal to the number of InputSplits calculated by the client submitting the job.

Answer : d

 
 
42) Which daemon instantiates Java Virtual Machines in a cluster running MapReduce v1 (MRv1)?

  1. ResourceManager
  2. TaskTracker
  3. JobTracker
  4. DataNode

Answer : b

 
 
43) How many files does a reduce task generate?

  1. one file altogether
  2. one file per reducer
  3. Depends on the input file
  4. None

Answer : b

 
 
44) What happens if the NameNode is down and a job is submitted?

  1. It will connect to the Secondary NameNode to process the job
  2. It waits until the NameNode comes up
  3. It gets files from the local disk
  4. The job will fail

Answer : d

 
 
45) What is partitioning in MapReduce?

  1. Dividing map output into equal partitions
  2. Creating a new partition when map output exceeds a limit
  3. Assigning map output keys to reducers
  4. None of the above

Answer : c
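
Conceptually, the default assignment is a hash of the key; this small runnable sketch (names illustrative) computes the same formula Hadoop's HashPartitioner uses:

    import org.apache.hadoop.io.Text;

    public class HashPartitionDemo {
        public static void main(String[] args) {
            Text key = new Text("example-key");
            int numReduceTasks = 4;
            // Mask keeps the hash non-negative; modulo picks one reducer.
            int partition = (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
            System.out.println("key goes to reducer " + partition);
        }
    }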

 
 
46) The Reducer class defines

  1. How to process one key at a time
  2. How to process multiple keys together
  3. Depends on the logic; anything can be done
  4. Depends on the number of keys

Answer : a

 
 
47) By default, the number of map tasks depends upon

  1. number of machines
  2. Number of files
  3. configurable
  4. number of splits

Answer : d

 
 
48) How does MapReduce hide data dispersion?

  1. Through Map Reduce components
  2. By defining data as keys and values
  3. By clustering machines
  4. HDFS takes care of it

Answer : b

 
 
49) What is the input format for Hadoop Archive files?

  1. TextInputFormat
  2. SequenceFileInputFormat
  3. None of these
  4. There is no suitable Input Format Type

Answer : d

 
 
50) The map() method uses which object to send output to the MapReduce framework?

  1. JobClient
  2. Config
  3. Context
  4. It can directly write

Answer : c




 
51) The number of partitions is equal to

  1. Number of Reducers
  2. Number of Mappers
  3. Number of Input Split
  4. Number of output directories

Answer : a

 
 
52) A combiner class can be created by extending

  1. Combiner Class
  2. Mapper class
  3. Reducer Class
  4. Partitioner Class

Answer : c
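
A sketch (names illustrative): a combiner is written exactly like a reducer by extending Reducer, then wired in with job.setCombinerClass(...). A sum is safe to combine because addition is associative and commutative:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class SumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();                // partial aggregation on the map side
            }
            context.write(key, new IntWritable(sum));
        }
    }
    // Wiring (illustrative): job.setCombinerClass(SumCombiner.class);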

 
 
53) Distributed cache can be used to add

  1. a data file
  2. a jar file library
  3. both 1 and 2
  4. None of the above

Answer : c
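
A hedged sketch for Hadoop 2.x (paths and job name are illustrative) adding both a data file and a jar library:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;

    public class CacheSetup {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "cache-demo");
            job.addCacheFile(new URI("/apps/lookup/cities.txt"));      // a data file
            job.addFileToClassPath(new Path("/apps/lib/helper.jar"));  // a jar library
        }
    }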

 
 
54) To achieve custom partitioning, one needs to implement

  1. Logic to be written in Mapper
  2. Logic to be written in Reducer
  3. Partitioner
  4. Combiner

Answer : c
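
A sketch of a custom partitioner (new API; the routing rule and names are illustrative): extend Partitioner, override getPartition, and register it on the job:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class FirstCharPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            // Keys starting with the same character land on the same reducer;
            // an empty key falls back to partition 0.
            String s = key.toString();
            return s.isEmpty() ? 0 : s.charAt(0) % numPartitions;
        }
    }
    // Wiring (illustrative): job.setPartitionerClass(FirstCharPartitioner.class);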

 
 
55) The _______________ file controls metrics reporting in Hadoop

  1. core-site.xml
  2. log4j.properties
  3. hadoop-env.sh
  4. hadoop-metrics.properties

Answer : d

 
 
56) What is the default input key type for TextInputFormat?

  1. LongWritable
  2. ShortWritable
  3. NullWritable
  4. Text

Answer : a

 
 
57) The output of the reducer is written to

  1. temp directory
  2. HDFS
  3. Local disk
  4. None of the above

Answer : b

 
 
58) How do you specify UNIX time in milliseconds in Flume?

  1. %u
  2. %b
  3. %t
  4. None

Answer : c

 
 
59) How do you specify the long month name (January, February) in Flume?

  1. %b
  2. %B
  3. %M
  4. %l

Answer : b