Important questions / expected questions for the Nov Dec 2016 IT6006 Data Analytics examination conducted by Anna University Chennai

B.E./ B.Tech. DEGREE EXAMINATION Nov Dec 2016

07th Semester / Seventh Semester / IV Year

Department of IT

IT6006 Data Analytics

(Regulation 2013)

Nov Dec 2016 Important Questions

Important 16 Marks Questions with answers (All five units) are listed for IT6006 Data Analytics subject

1. i. What is Big data? Describe the main features of a data analytical system. (8)

ii. Describe in detail the role of statistical models in Big data. (8)

2. i. Describe hypothesis testing in detail. (8)

ii. Describe in detail probability distributions and entropy. (8)

3. How would you distinguish analysis and reporting tools used in Big-data?

4. Explain the concept of Friedman’s bias–variance decomposition for classifiers.

5. How would you compose the statistical concepts in inference?

6. Examine how you would implement regression modeling.

7. i. Define Principal component analysis. (4)

ii. Describe in detail cluster analysis and mixture decomposition. (12)

8. i. Can you identify the different mechanisms needed for learning? (8)

ii. How do you use the generalization techniques needed to illustrate neural networks? (8)

9. i. Explain briefly the primal and dual representations in the kernel perceptron. (8)

ii. Compare and contrast PCA and CCA. (8)

10. How would you formulate the ideas of search methods in stochastic data analysis?

11. Describe the Big Data Stream Analytics Framework (BDSAF) with a neat architecture diagram

12. How is data analysis used in stock market predictions?

13. i. What approaches would you use to estimate the moments? (8)

ii. Examine the cost of exact counts. (8)

14. i. How is sentiment analysis playing a major role in data mining? (8)

ii. What approaches would you use to perform sentiment analysis? (8)

15. i. Can you assess the importance of sampling data in a stream? (10)

ii. Enlist the different stream sources. (6)

16. i. Define the K-Means algorithm. How will you initialize the clusters and pick the value of K? (10)

ii. Examine how the data is processed in the BFR Algorithm. (6)

17. What approach would you use to handle large datasets in main memory?

18. i. What are the main features of the GRGPF Algorithm? (6)

ii. How would you initialize the cluster tree and add points in the GRGPF Algorithm? (10)

19. Describe stream clustering and parallel clustering.

20. Evaluate the market basket data and its use in main memory

21. i. Highlight the features of Hadoop and explain the functionalities of a Hadoop cluster. (8)

ii. Describe briefly Hadoop input and output, and write a note on data integrity. (8)

22. Summarize briefly:

i. Algorithms using MapReduce (8)

ii. Extensions to MapReduce (8)

23. Explain briefly:

i. MapR (6) ii. Sharding (6) iii. S3 (4)

24. Explain the complexity theory for MapReduce. What are reducer size and replication rate?

25. Consider a collection of literature surveys made by a researcher, in the form of a text document, on cloud and big data analytics. Using Hadoop and MapReduce, write a program to count the occurrences of the predominant keywords.
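
For question 25, the map, shuffle, and reduce steps of a keyword count can be sketched as below. This is only a minimal local simulation in plain Python, not a Hadoop job: it does not use the Hadoop API, and the names KEYWORDS, mapper, shuffle, reducer, and count_keywords, as well as the keyword list itself, are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical keyword list; the question does not specify one.
KEYWORDS = {"cloud", "hadoop", "mapreduce", "analytics", "data"}


def mapper(line):
    """Map phase: emit (keyword, 1) for every keyword occurrence in a line."""
    for word in re.findall(r"[a-z]+", line.lower()):
        if word in KEYWORDS:
            yield word, 1


def shuffle(pairs):
    """Shuffle phase: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: sum the counts collected for one key."""
    return key, sum(values)


def count_keywords(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = [
        "Cloud analytics on Hadoop.",
        "Hadoop and MapReduce process big data in the cloud.",
    ]
    print(count_keywords(sample))
```

On a real cluster the same mapper and reducer logic would typically be run by the framework across many input splits, with the grouping step handled by Hadoop rather than by a local dictionary.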
