BIG DATA

Data science interview questions:

Data can be classified into 3 types:

1024 bytes = 1 KB

The real challenge starts when we move to petabytes of data.

Big Data deals with data at the petabyte scale and above.
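
The unit arithmetic above can be sketched in a few lines of Python. This is only an illustration: it assumes the 1024-based convention used in these notes (1024 bytes = 1 KB, and so on up each step), and the names `UNITS`, `bytes_in`, `humanize`, and `is_big_data_scale` are made up for this example.

```python
# Size units in ascending order; each step is a factor of 1024,
# following the notes' convention that 1024 bytes = 1 KB.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one `unit` (powers of 1024)."""
    return 1024 ** UNITS.index(unit)


def humanize(n_bytes: int) -> str:
    """Format a byte count using the largest unit that keeps the value >= 1."""
    value = float(n_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the last unit.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Threshold taken from the notes: "petabyte and above".
PETABYTE = bytes_in("PB")  # 1024 ** 5 bytes


def is_big_data_scale(n_bytes: int) -> bool:
    """Hypothetical check: is this volume at petabyte scale or larger?"""
    return n_bytes >= PETABYTE


if __name__ == "__main__":
    print(bytes_in("KB"))                      # bytes in one kilobyte
    print(humanize(5 * bytes_in("TB")))        # a 5 TB volume
    print(is_big_data_scale(bytes_in("TB")))   # a terabyte is below the threshold
    print(is_big_data_scale(2 * PETABYTE))     # two petabytes is above it
```

The petabyte cutoff here is simply the rule of thumb stated in these notes, not a formal definition.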

Sources of Big Data: 

Computer-generated data: produced with minimal human interaction, for example by gaming servers and web servers.

Human-generated data: produced by day-to-day activities such as going to the bank or using Facebook, LinkedIn, and Twitter. Everything is included, down to the mouse click.

Big data: data coming in from all sources. It can be human-generated or computer-generated.

The New York Stock Exchange produces 4-5 TB of data per day. The Internet stores 18.5 petabytes of data per day. 7 billion cell phones store 60-80 gigabytes of data every second: text and call data is analysed in some form.

YouTube: 40-50 hours of video, amounting to more than a few terabytes, uploaded per minute.

Twitter: 12 terabytes of tweets every day.

Yahoo: 60 petabytes per day.

eBay: 40 petabytes per day.