Sources of Data
- Relational databases
- Flat files and XML datasets
- APIs and web services
- Web scraping
- Data streams and feeds
1. Relational databases
- Examples: business activities, customer transactions, human resource activities, workflows
- Applications: SQL Server, Oracle, MySQL, IBM DB2
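A minimal sketch of pulling data out of a relational database, using Python's built-in sqlite3 module as a stand-in for the systems listed above. The `transactions` table, its columns, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database so the sketch needs no server or file on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table of customer transactions (names are assumptions).
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 35.5), ("alice", 4.5)],
)

# Query the data with SQL: total amount spent per customer.
cursor = conn.execute(
    "SELECT customer, SUM(amount) FROM transactions GROUP BY customer"
)
totals = dict(cursor.fetchall())
conn.close()
```

The same SQL would generally carry over to a server-based database, though the connection code depends on the driver in use.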
2. Flat files and XML datasets
- Flat files:
- store data in plain text format
- each line, or row, is one record
- each value is separated by a delimiter
- all of the data in a file maps to a single table
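The flat-file properties above can be sketched with Python's built-in csv module. The sample text, the column names, and the helper `read_flat_file` are hypothetical; a comma is assumed as the delimiter, and the first line is assumed to hold the column names.

```python
import csv
import io

# Hypothetical flat-file contents: plain text, one record per line,
# values separated by a comma, all rows belonging to a single table.
RAW = "id,name,city\n1,Ada,London\n2,Grace,New York\n"

def read_flat_file(text, delimiter=","):
    """Parse delimited text into a list of dicts, one dict per record."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)

rows = read_flat_file(RAW)
for row in rows:
    print(row["id"], row["name"], row["city"])
```

Passing `delimiter="\t"` would handle tab-separated files in the same way. Note that every value is read as a string, so numeric columns would need converting separately.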
3. APIs and web services
- Examples: Twitter and Facebook APIs, stock market APIs, data lookup and validation APIs
4. Web scraping
- Examples: collecting training and testing datasets for machine learning models…
- Applications: BeautifulSoup, Scrapy, Pandas, Selenium…
5. Data streams and feeds
- Examples: sensor data feeds for monitoring industrial or farming machinery…
Data repository:
Data lake / Data warehouse
Big data stores