Sam Eldin

Big Data
How can anyone imagine what Big Data is?
Let us look at my family, household, and businesses and see what personal data (credit cards, credit history, what I own, places, travel, habits, phone calls, emails, web sites, etc.) is already collected and for sale, regardless of all the privacy laws we have. Not to mention that I am an IT professional who is conscious of what is done behind the scenes. Sadly, I cannot even slow any of these companies down from collecting my data. Social media is doing a great job of selling data about everything we do.

There are 329.45 million (August 2019) people in the USA, and the data collected about the population is beyond comprehension. The volume of data is typically measured in terabytes (1,000 to the power of 4 bytes) or petabytes (1,000 to the power of 5 bytes), and it keeps growing. So data is big and getting bigger.

How to:
Big Data

   1. Collecting 2. Storing 3. Securing 4. Certifying
   5. Cleaning 6. Analyzing 7. Cleaning and pruning 8. Extracting, Transforming and Loading
   9. Structuring - Business Model 10. Mining 11. Relation and Correlation 12. Finding trends, correlations and patterns
   13. Creating data sets 14. Profiling 15. Personalization 16. Predictive
   17. Customization 18. Segmentation 19. Forecasting 20. Creating reports
   21. Virtualization 22. Sales Support 23. Market research 24. Compression
   25. Maintaining and updating 26. Conversion 27. Encryption 28. Data streaming

Data Farming
Data Farming is the process of using a high-performance computer or computing grid to run a simulation thousands or millions of times across a large parameter and value space. The result of Data Farming is a landscape of output that can be analyzed for trends, anomalies, and insights in multiple parameter dimensions.
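
To make the idea concrete, here is a minimal sketch of a data farming run in Python. It is only an illustration: the toy model `simulate`, the helper `farm`, and the parameter names `p_success` and `n_trials` are hypothetical and are not part of any product or site described here. The sketch runs the toy model many times for every combination of parameter values and keeps the average outcome per combination, which is the "landscape of output".

```python
import itertools
import random
import statistics


def simulate(p_success, n_trials, rng):
    """Hypothetical toy model: count successes in n_trials random attempts."""
    return sum(rng.random() < p_success for _ in range(n_trials))


def farm(param_grid, replications, seed=0):
    """Run the toy model `replications` times for every parameter combination.

    Returns a dict that maps (p_success, n_trials) to the mean outcome.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    landscape = {}
    combos = itertools.product(param_grid["p_success"], param_grid["n_trials"])
    for p_success, n_trials in combos:
        runs = [simulate(p_success, n_trials, rng) for _ in range(replications)]
        landscape[(p_success, n_trials)] = statistics.mean(runs)
    return landscape


# Assumed parameter space: 3 probabilities x 2 trial counts = 6 combinations.
grid = {"p_success": [0.1, 0.5, 0.9], "n_trials": [10, 100]}
landscape = farm(grid, replications=200)

for params, mean_outcome in sorted(landscape.items()):
    print(params, round(mean_outcome, 2))
```

In a real data farming setup the loop over combinations would be spread across a computing grid, and the landscape would hold more than one statistic per combination.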

Our Approach
Looking at the big picture, we can state that there are things we cannot improve and things we can improve by creating shortcuts or faster approaches.

What are the things we cannot improve?
We cannot do anything about:

         Collecting, Storing, Securing, Certifying and Cleaning.

These tasks are mandatory, and there are no shortcuts for them.

What are the things we can improve?
Each data item, regardless of what it is or where it comes from, has a value. Creating shortcuts for these values would reduce the data size and speed up processing. For example, a person's age can be presented in the following forms:

         • Twenty nine years old (20 bytes or more)
         • 29 (two digits or two bytes, or three digits if over 99)
         • One byte (0 to 255) = 00011101

Position of each bit in one byte:    8    7    6    5    4    3    2    1
Value of each position:            128   64   32   16    8    4    2    1
Bits for total value = 29:           0    0    0    1    1    1    0    1
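
The three forms of the age 29 listed above can be compared directly. The following Python sketch is only an illustration under simple assumptions (plain ASCII text, one unsigned byte for the compact form); the variable names are made up for this example. It measures the size of each form and rebuilds the value 29 from the bit positions.

```python
age = 29

# Three encodings of the same value.
as_words = "Twenty nine years old".encode("ascii")  # spelled-out text
as_digits = str(age).encode("ascii")                # decimal digits stored as text
as_byte = age.to_bytes(1, "big")                    # one unsigned byte (0 to 255)

sizes = {
    "words": len(as_words),
    "digits": len(as_digits),
    "byte": len(as_byte),
}

# Bit pattern of the single byte, most significant bit first.
bits = format(age, "08b")

# Value of each bit position, from position 8 down to position 1.
weights = [128, 64, 32, 16, 8, 4, 2, 1]
total = sum(weight for weight, bit in zip(weights, bits) if bit == "1")

print(sizes)        # size in bytes of each form
print(bits, total)  # bit pattern and the value it adds up to
```

The spelled-out form takes 21 bytes, the digits take 2 bytes, and the compact form takes 1 byte, whose set bits add up as 16 + 8 + 4 + 1 = 29.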

We can improve data processing both vertically and horizontally:

Build shorter and faster formats for processing

Build intelligent data sets and data structures

The following sites are our presentation of handling CRM and Big Data: