There are a total of 95 questions in the test, with 1 mark for each correct answer and no negative marking.

Before attempting any question, make sure to read it properly and understand what is being asked.

Make sure that you have a reliable internet connection and a quiet place to take the test.

All the Best for your Test!

Name

Email

Phone

For a classification task, instead of random weight initializations in a neural network, we set all the weights to zero. Which of the following statements is true?

There will not be any problem and the neural network will train properly

The neural network will train but all the neurons will end up recognizing the same thing

The neural network will not train as there is no net gradient change

None of these
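
The effect of zero initialization can be explored with a minimal sketch. It assumes a tiny two-layer sigmoid network without biases, a made-up three-sample dataset, and a hypothetical learning rate of 0.5; none of these come from the question itself.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Hypothetical toy dataset: two inputs -> one binary label.
data = [((0.0, 1.0), 1.0), ((1.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]

n_hidden = 3
w1 = [[0.0, 0.0] for _ in range(n_hidden)]  # hidden-layer weights, all zero
w2 = [0.0] * n_hidden                        # output-layer weights, all zero
lr = 0.5                                     # assumed learning rate

for _ in range(200):
    for x, target in data:
        # Forward pass.
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        out = sigmoid(sum(w * hj for w, hj in zip(w2, h)))
        # Backward pass (cross-entropy loss with a sigmoid output).
        d_out = out - target
        d_h = [d_out * w2[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
        # Gradient-descent updates.
        for j in range(n_hidden):
            w2[j] -= lr * d_out * h[j]
            for i in range(2):
                w1[j][i] -= lr * d_h[j] * x[i]

# The weights do change, but every hidden neuron holds the same values.
print(w1)
print(w2)
```

Because every hidden unit starts from the same state and receives the same gradient, nothing in this sketch ever breaks the symmetry between them.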

What is meant by a hard margin in an SVM?

The SVM allows very low error in classification

The SVM allows high amount of error in classification

None of the above

Which of the following statements is true about an autoencoder?

It is an unsupervised ML algorithm similar to Principal Component Analysis

It minimizes the same objective function as Principal Component Analysis

The neural network’s target output is its input

All of the above

Which of the following will run without errors?

round(45.8)

round(6352.898,2,5)

round()

round(7463.123,2,1)

Which of the following is an invalid variable?

my_string_1

1st_string

foo

_

Given a function that does not return any value, what value is returned by default when it is executed in the shell?

int

bool

void

None

In which neural network architecture does weight sharing occur?

Convolutional neural Network

Recurrent Neural Network

Fully Connected Neural Network

Both A and B

What does ~~~~~~5 evaluate to?

+5

-11

+11

-5

What are the benefits of mini-batch gradient descent?

It is more efficient than stochastic gradient descent.

It improves generalization by finding flat minima.

Mini-batches help to approximate the gradient of the entire training set, which helps us to avoid local minima.

All of the above

What is the volume of a 32-metre-high cylindrical tank?

I. The area of its base is 154 m².

II. The diameter of the base is 14 m.

I alone is sufficient while II alone is not sufficient to answer

II alone is sufficient while I alone is not sufficient to answer

Either I or II alone is sufficient to answer

Both I and II are not sufficient to answer

Both I and II are necessary to answer

Which of the following is incorrect?

a) float('inf')

b) float('nan')

c) float('56'+'78')

d) float('12+34')

Which one of the following has the same precedence level?

a) Addition and Subtraction

b) Multiplication, Division and Addition

c) Multiplication, Division, Addition and Subtraction

d) Addition and Multiplication

You are building a neural network where it gets input from the previous layer as well as from itself.

Which of the following architectures has feedback connections?

Recurrent Neural network

Convolutional Neural Network

Restricted Boltzmann Machine

None of these

Which of the following distance metrics cannot be used in k-NN?

Manhattan

Minkowski

Tanimoto

Jaccard

Mahalanobis

All can be used

The effectiveness of an SVM depends upon:

Selection of Kernel

Kernel Parameters

Soft Margin Parameter C

All of the above

How do you select the best hyperparameters in tree-based models?

Measure performance over training data

Measure performance over validation data

Both of these

None of these

Which of the following sentences is FALSE regarding regression?

It relates inputs to outputs.

It is used for prediction.

It may be used for interpretation.

It discovers causal relationships

For an image recognition problem (recognizing a cat in a photo), which architecture of neural network would be better suited to solve the problem?

Multi Layer Perceptron

Convolutional Neural Network

Recurrent Neural network

Perceptron

Which of the following cross validation techniques is better suited for time series data?

k-Fold Cross Validation

Leave-one-out Cross Validation

Stratified Shuffle Split Cross Validation

Forward Chaining Cross Validation

How would you import a decision tree classifier in sklearn?

from sklearn.decision_tree import DecisionTreeClassifier

from sklearn.ensemble import DecisionTreeClassifier

from sklearn.tree import DecisionTreeClassifier

None of these

import time
str = '21/01/2017'
datetime_value = time.strptime(str, date_format)

To convert the above string, what should be written in place of date_format?

"%d/%m/%y"

"%D/%M/%Y"

"%d/%M/%Y"

"%d/%m/%Y"

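As a quick illustration of how format directives map onto a date string, the sketch below parses the same value with the standard-library `time.strptime`. The variable names `date_string` and `parsed` are assumptions, chosen here to avoid shadowing the built-in `str`.

```python
import time

date_string = '21/01/2017'

# %d = day of the month, %m = month number, %Y = four-digit year.
# (%y expects a two-digit year and %M means minutes, not months.)
parsed = time.strptime(date_string, "%d/%m/%Y")
print(parsed.tm_mday, parsed.tm_mon, parsed.tm_year)  # 21 1 2017
```
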
In a 300 m race A beats B by 22.5 m or 6 seconds. B's time over the course is:

86 sec

80 sec

76 sec

None of these

A, B, C rent a pasture. A puts 10 oxen for 7 months, B puts 12 oxen for 5 months and C puts 15 oxen for 3 months for grazing. If the rent of the pasture is Rs. 175, how much must C pay as his share of rent?

Rs. 45

Rs. 50

Rs. 55

Rs. 60

Which of the following algorithms is not an example of an ensemble learning algorithm?

Random Forest

Adaboost

Extra Trees

Gradient Boosting

Decision Trees

Which gradient technique is more advantageous when the data is too big to handle in RAM simultaneously?

Full Batch Gradient Descent

Stochastic Gradient Descent

Mini Batch Gradient Descent

None of the above

If a = 0.1039, then the value of $\sqrt{4a^2 - 4a + 1} + 3a$ is:

0.1039

0.2078

1.1039

2.1039

Which of the following is invalid?

_a = 1

__a = 1

__str__ = 1

none of the mentioned

Which of the following offsets do we use in a least-squares line fit for linear regression? Suppose the horizontal axis is the independent variable and the vertical axis is the dependent variable.

Vertical offset

Perpendicular offset

Both, depending on the situation

None of the above

A, B and C enter into a partnership in the ratio 7/2 : 4/3 : 6/5. After 4 months, A increases his share by 50%. If the total profit at the end of one year is Rs. 21,600, then B's share in the profit is:

Rs. 2100

Rs. 2400

Rs. 3600

Rs. 4000

Different learning methods do not include:

Memorization

Analogy

Deduction

Introduction

Three times the first of three consecutive odd integers is 3 more than twice the third. The third integer is:

9

11

13

15

What is the output of type(45/3)?

int

float

double

The least number which when divided by 5, 6, 7 and 8 leaves a remainder 3, but when divided by 9 leaves no remainder, is:

1677

1683

2523

3363

Which of the following is an invalid statement?

abc = 1,000,000

a b c = 1000 2000 3000

a,b,c = 1000, 2000, 3000

a_b_c = 1,000,000

What would be the best value for random_state (seed value)?

np.random.seed(1)

np.random.seed(40)

np.random.seed(32)

Can't say

Which of the following is not a keyword?

eval

assert

nonlocal

pass

Which of the following methods do we use to find the best fit line for data in Linear Regression?

Least Square Error

Maximum Likelihood

Logarithmic Loss

Both A and B

Adding more basis functions in a linear model... (pick the most probable option)

Decreases model bias

Decreases estimation bias

Decreases variance

Doesn't affect bias and variance

Which of the following is incorrect?

x = 0b101

x = 0x4f5

x = 19023

x = 03964

How much did Rohit get as profit at the year-end in the business done by Nitin, Rohit and Kunal?

I. Kunal invested Rs. 8000 for nine months, his profit was times that of Rohit's, and his investment was four times that of Nitin.

II. Nitin and Rohit invested for one year in the proportion 1: 2 respectively.

III. The three together got Rs. 1000 as profit at the year-end.

Only I and II

Only I and III

Question cannot be answered even with the information in all the three statements.

All I, II and III

None of these

A can run 22.5 m while B runs 25 m. In a kilometre race B beats A by:

100 m

1000/9 m

25 m

50 m

The expression Int(x) implies that the variable x is converted to integer.

a) True

b) False

What if we use a learning rate that’s too large?

Network will converge

Network will not converge

Can’t Say

All of the above

Three numbers which are co-prime to each other are such that the product of the first two is 551 and that of the last two is 1073. The sum of the three numbers is:

75

81

85

89

In a regular week, there are 5 working days and for each day, the working hours are 8. A man gets Rs. 2.40 per hour for regular work and Rs. 3.20 per hour for overtime. If he earns Rs. 432 in 4 weeks, then how many hours does he work for?

160

175

180

195

What percentage of numbers from 1 to 70 have 1 or 9 in the unit's digit?

1

14

20

21

The cost price of 20 articles is the same as the selling price of x articles. If the profit is 25%, then the value of x is:

15

16

18

25

What are the steps for using a gradient descent algorithm?

1. Calculate the error between the actual value and the predicted value

2. Reiterate until you find the best weights of the network

3. Pass an input through the network and get values from the output layer

4. Initialize random weight and bias

5. Go to each neuron that contributes to the error and change its respective values to reduce the error

1, 2, 3, 4, 5

5, 4, 3, 2, 1

3, 2, 1, 5, 4

4, 3, 1, 5, 2
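
A gradient descent loop can be sketched on a one-weight linear model. This is a minimal illustration, not a full neural network; the toy data, the fixed starting values (used instead of random ones so the run is reproducible), and the learning rate are all assumptions.

```python
# Hypothetical toy data following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Initialize weight and bias (fixed here; normally random).
w, b = 0.5, 0.0
learning_rate = 0.05  # assumed value

for epoch in range(2000):              # repeat until the parameters settle
    grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        prediction = w * x + b         # pass the input through the model
        error = prediction - y         # compare prediction with actual value
        grad_w += 2 * error * x / len(xs)
        grad_b += 2 * error / len(xs)
    w -= learning_rate * grad_w        # adjust parameters to reduce the error
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))  # 2.0 1.0
```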

What error occurs when you execute the following Python code snippet? apple = mango

SyntaxError

NameError

ValueError

TypeError

Which of the following options is true about the k-NN algorithm?

It can be used for classification

It can be used for regression

It can be used in both classification and regression

4 mat-weavers can weave 4 mats in 4 days. At the same rate, how many mats would be woven by 8 mat-weavers in 8 days?

4

8

12

16

Which of the following is not an example of a time series model?

Naive approach

Exponential smoothing

Moving Average

None of the above

Which of the following is not a complex number?

k = 2 + 3j

k = complex(2, 3)

k = 2 + 3l

k = 2 + 3J

“Convolutional Neural Networks can perform various types of transformation (rotations or scaling) on an input.”

Is the statement true or false?

True

False

In training a neural network, you notice that the loss does not decrease in the few starting epochs.

The reasons for this could be:

1. The learning rate is low

2. The regularization parameter is high

3. Stuck at local minima

What according to you are the probable reasons?

1 and 2

2 and 3

1 and 3

Any of these

When a pooling layer is added to a convolutional neural network, translation invariance is preserved.

True or False?

True

False

What is a dead unit in a neural network?

A unit which isn’t updated during training by any of its neighbours

A unit which does not respond completely to any of the training patterns

The unit which produces the biggest sum-squared error

None of these

What will be the output of print(print())?

SyntaxError

NameError

None

\nNone

Which of the following cannot be a variable?

__init__

in

it

on

What will be the output of the following Python code?

>>> str = "hello"
>>> str[:-3]

he

lo

olleh

hello
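
Slicing with a negative index can be checked directly in the interpreter. The variable name `word` is an assumption, used here so the built-in `str` is not shadowed.

```python
word = "hello"

# A negative stop index counts from the end, so [:-3] drops the last three characters.
print(word[:-3])   # he
# A negative start index keeps only the last three characters.
print(word[-3:])   # llo
# A step of -1 walks the string backwards.
print(word[::-1])  # olleh
```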

Which is the correct operator for power (x raised to the power y)?

x^y

x**y

x^^y

None of the mentioned

What is the data type of L?

L = [('Ravi', 'Aman '),(1, 23, 'hello', 1)]

list

dictionary

array

tuple

What are the factors to select the depth of neural network?

1. Type of neural network (e.g. MLP, CNN, etc.)

2. Input data

3. Computation power, i.e. hardware capabilities and software capabilities

4. Learning rate

5. The output function to map

1, 2, 4, 5

2, 3, 4, 5

1, 3, 4, 5

All of these

If you increase the number of hidden layers in a Multi Layer Perceptron, the classification error of test data always decreases.

True or False?

True

False

A and B can together finish a piece of work in 30 days. They worked together for 20 days and then B left. After another 20 days, A finished the remaining work. In how many days can A alone finish the work?

40

50

54

60

In random forest or gradient boosting algorithms, features can be of any type. For example, a feature can be continuous or categorical. Which of the following options is true when you consider these types of features?

Only Random forest algorithm handles real valued attributes by discretizing them

Only Gradient boosting algorithm handles real valued attributes by discretizing them

Both algorithms can handle real valued attributes by discretizing them

None of these

In a neural network, which of the following techniques is used to deal with overfitting?

A. Dropout

B. Regularization

C. Batch Normalization

D. All of these

Two dice are thrown simultaneously. What is the probability of getting two numbers whose product is even?

1/2

3/4

3/8

5/16
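
The probability can be verified by brute-force enumeration of all 36 outcomes. This is a small sketch using only the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Keep the pairs whose product is even (at least one die shows an even number).
even_product = [pair for pair in outcomes if (pair[0] * pair[1]) % 2 == 0]

probability = Fraction(len(even_product), len(outcomes))
print(len(even_product), len(outcomes), probability)  # 27 36 3/4
```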

Which of the following is an example of a deterministic algorithm?

PCA

K-Means

None of the above

Two trains, one from Howrah to Patna and the other from Patna to Howrah, start simultaneously. After they meet, the trains reach their destinations after 9 hours and 16 hours respectively. The ratio of their speeds is:

2 : 3

4 : 3

6 : 7

9 : 16

What do you mean by generalization error in terms of the SVM?

How far the hyperplane is from the support vectors

How accurately the SVM can predict outcomes for unseen data

The threshold amount of error in an SVM

What will be the output of the following Python expression?

bin(29)

‘0b10111’

‘0b11101’

‘0b11111’

‘0b11011’
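
The built-in `bin` returns a string with a `0b` prefix. A short check, with `n` as an illustrative variable name:

```python
n = 29

# 29 = 16 + 8 + 4 + 0 + 1, i.e. binary digits 1 1 1 0 1.
print(bin(n))            # 0b11101
# Converting the digits back confirms the value.
print(int('11101', 2))   # 29
```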

What is the type of abc if abc = 100,000,10?

object

tuple

List

Now let’s revise the previous slides. We have learned that:

A neural network is a (crude) mathematical representation of a brain, which consists of smaller components called neurons.

Each neuron has an input, a processing function, and an output.

These neurons are stacked together to form a network, which can be used to approximate any function.

To get the best possible neural network, we can use techniques like gradient descent to update our neural network model.

Given above is a description of a neural network. When does a neural network model become a deep learning model?

When you add more hidden layers and increase depth of neural network

When there is higher dimensionality of data

When the problem is an image recognition problem

None of these

In a 200 metres race A beats B by 35 m or 7 seconds. A's time over the course is:

40 sec

47 sec

33 sec

None of these

The number of neurons in the output layer should match the number of classes (where the number of classes is greater than 2) in a supervised learning task.

True or False?

True

False

Two stations A and B are 110 km apart on a straight line. One train starts from A at 7 a.m. and travels towards B at 20 kmph. Another train starts from B at 8 a.m. and travels towards A at a speed of 25 kmph. At what time will they meet?

9 a.m.

10 a.m.

10.30 a.m.

11 a.m.

What does ~4 evaluate to?

-5

-4

-3

+3

The neural network consists of many neurons, each neuron takes an input, processes it and gives an output. Here’s a diagrammatic representation of a real neuron.

Which of the following statement(s) correctly represents a real neuron?

A neuron has a single input and a single output only

A neuron has multiple inputs but a single output only

A neuron has a single input but multiple outputs

A neuron has multiple inputs and multiple outputs

All of the above statements are valid

K-fold cross-validation is

linear in K

quadratic in K

cubic in K

exponential in K

Suppose you have inputs as x, y, and z with values -2, 5, and -4 respectively. You have a neuron ‘q’ and neuron ‘f’ with functions: q = x + y , f = q * z

What is the gradient of f with respect to x, y, and z?

(HINT: To calculate the gradient, you must find (df/dx), (df/dy) and (df/dz))

(-3,4,4)

(4,4,3)

(-4,-4,3)

(3,-4,-4)
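
The hint can be followed step by step with the chain rule. The sketch below does a manual forward and backward pass; the intermediate names `df_dq`, `df_dx`, `df_dy` and `df_dz` are assumptions introduced for illustration.

```python
x, y, z = -2.0, 5.0, -4.0

# Forward pass.
q = x + y            # 3.0
f = q * z            # -12.0

# Backward pass via the chain rule.
df_dq = z            # f = q * z, so df/dq = z
df_dz = q            # f = q * z, so df/dz = q
df_dx = df_dq * 1.0  # q = x + y, so dq/dx = 1
df_dy = df_dq * 1.0  # q = x + y, so dq/dy = 1

print((df_dx, df_dy, df_dz))  # (-4.0, -4.0, 3.0)
```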

Which one of the following has the highest precedence in the expression?

Exponential

Addition

Multiplication

Parentheses

Which statement is true about NAG?

It calculates gradient from future position.

It calculates gradient from current position.

It calculates gradient from previous position.

None

Let's say you are using activation function X in the hidden layers of a neural network. At a particular neuron, for a given input, you get the output -0.0001. Which of the following activation functions could X represent?

ReLU

tanh

SIGMOID

None of these

What will be the output of the following Python code snippet if x=1? x<<2

8

1

2

4

A is 30% more efficient than B. How much time will they, working together, take to complete a job which A alone could have done in 23 days?

11 days

13 days

343/17 days

None of these

If I am using all features of my dataset and I achieve 100% accuracy on my training set, but ~70% on validation set, what should I look out for?

Underfitting

Nothing, the model is perfect

Overfitting

Which of the following are real world applications of the SVM?

Text and Hypertext Categorization

Image Classification

Clustering of News Articles

All of the above

What is the output of this expression, 3*1**3?

27

9

3

1
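
Operator precedence in expressions like this one can be checked directly. The extra expressions are illustrative additions showing how parentheses and right-to-left exponentiation change the result.

```python
# ** binds tighter than *, so this is 3 * (1 ** 3).
print(3 * 1 ** 3)     # 3
# Parentheses override the default precedence.
print((3 * 1) ** 3)   # 27
# ** groups right to left: 2 ** (3 ** 2).
print(2 ** 3 ** 2)    # 512
```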

Consider a function which is defined below:

def fun(x):
    x[0] = 5
    return x

Now you define a list which has three numbers in it.

g = [10, 11, 12]

Which of the following will be the output of the given print statement?

print(fun(g), g)

[5, 11, 12] [5, 11, 12]

[5, 11, 12] [10, 11, 12]

[10, 11, 12] [10, 11, 12]

[10, 11, 12] [5, 11, 12]
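
Python passes a reference to the same list object into a function. The sketch below, written for Python 3, shows that behaviour; the names `result` and `h` and the copy-based variant are illustrative assumptions.

```python
def fun(x):
    x[0] = 5   # mutates the list object the caller passed in
    return x

g = [10, 11, 12]
result = fun(g)
print(result, g)        # [5, 11, 12] [5, 11, 12]
print(result is g)      # True: both names refer to one list

# Passing a copy leaves the original untouched.
h = [10, 11, 12]
print(fun(list(h)), h)  # [5, 11, 12] [10, 11, 12]
```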

Which of the following statements is FALSE about AdaGrad?

There is change in learning rate for each update of weights

It will decrease learning rate if weights are being updated in short amount of time

It will increase learning rate if weights are not being updated

The learning rate remains the same; it is not modified

Which hyperparameters are set before training?

Activation Function

Learning Rate

Momentum

All of the above

Batch Normalization is helpful because

It normalizes (changes) all the input before sending it to the next layer

It returns back the normalized mean and standard deviation of weights

It is a very efficient backpropagation technique

None of these

What is the sequence of the following tasks in a perceptron?

Initialize weights of perceptron randomly

Go to the next batch of the dataset

If the prediction does not match the output, change the weights