5 Global Data Sources That Every Data Scientist Should Know About!

Photo by NASA on Unsplash

Here are 5 great, global, real-time or regularly curated data sources that every analyst, data scientist, or programmer (whatever you call yourself in this field) should know about!

I’ve covered each in walkthroughs or classes, so I’ll point them out if you want to get started really quickly.

Number 1 — OpenWeather

Get weather forecasts from around the world and much, much more. It’s free to sign up and use as long as you stay under the threshold of 60 calls per minute, which makes it perfect for an MVP/POC project!
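Staying under a calls-per-minute limit is easy to handle on the client side. Here is a minimal sketch that assumes a sliding 60-second window; the `CallThrottle` class and its parameters are hypothetical names for illustration and are not part of any OpenWeather library.

```python
import time
from collections import deque


class CallThrottle:
    """Blocks until another call fits inside the allowed rate.

    Hypothetical helper: keeps the timestamps of recent calls and sleeps
    when the window is already full.
    """

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._stamps = deque()   # timestamps of calls inside the window

    def wait(self):
        while True:
            now = self._clock()
            # Forget calls that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Window is full: sleep until the oldest call expires.
            self._sleep(self.period - (now - self._stamps[0]))
```

Call `throttle.wait()` right before each API request and the loop paces itself instead of tripping the limit.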

There should be hundreds of potential applications popping up in your head about what you could do with this highly customizable worldwide weather forecasting API, right? Help your community, your neighboring communities, anybody needing weather data — here’s your chance.

https://openweathermap.org/

I covered this in one of my vlogs. I even show how to pull the weather icons for each forecast.

https://www.viralml.com/video-content.html?fm=yt&v=NDjAwbwQ868
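To give a feel for what pulling a forecast and its icons can look like, here is a minimal sketch using only the standard library. It assumes the 5-day forecast endpoint and the icon URL pattern shown in the constants, plus the `list`, `dt_txt`, `main.temp`, and `weather[0]` response fields; check all of these against the current OpenWeather docs. The function names are mine, and you need your own API key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and icon URL pattern; verify against the OpenWeather docs.
FORECAST_URL = "https://api.openweathermap.org/data/2.5/forecast"
ICON_URL = "https://openweathermap.org/img/wn/{icon}@2x.png"


def build_forecast_url(city, api_key, units="metric"):
    """Build the request URL for a city such as 'London,uk'."""
    query = urlencode({"q": city, "appid": api_key, "units": units})
    return f"{FORECAST_URL}?{query}"


def summarize_forecast(payload):
    """Reduce the JSON response to time, temperature, description, and icon."""
    rows = []
    for entry in payload.get("list", []):
        weather = entry["weather"][0]
        rows.append({
            "time": entry["dt_txt"],
            "temp": entry["main"]["temp"],
            "description": weather["description"],
            "icon_url": ICON_URL.format(icon=weather["icon"]),
        })
    return rows


def fetch_forecast(city, api_key):
    """Call the API (needs network access and a valid key)."""
    with urlopen(build_forecast_url(city, api_key), timeout=10) as resp:
        return summarize_forecast(json.load(resp))
```

Keeping the parsing in `summarize_forecast` separate from the network call makes it easy to test against a saved response.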

Number 2 — GDELT

GDELT, the Global Database of Events, Language, and Tone, is a true mammoth of a data source. It contains over 40 years of news, and each year is about 2.5TB of data. Crazy big!

The entire GDELT database is 100% free and open. You can download raw files or use their GDELT Analysis Service. Another option is Google BigQuery, but be careful and keep an eye on the meter if you go that route.

https://www.gdeltproject.org/

I did a brief vlog on GDELT and pulled a few events using Google BigQuery. I also show how to run queries using time partitions, which can drastically reduce the amount of data scanned and therefore the cost.

https://www.viralml.com/video-content.html?v=yL4JZogjf8U
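Here is a rough sketch of the time-partition idea, not the exact query from the vlog. It assumes the public table `gdelt-bq.gdeltv2.events_partitioned`, filtering on BigQuery's `_PARTITIONTIME` pseudo-column, and the column names listed in the SELECT; verify them in the BigQuery console first. The helper names are mine, and the dry-run estimate needs the `google-cloud-bigquery` package plus credentials.

```python
from datetime import date, timedelta

# Assumed name of the partitioned public GDELT events table.
EVENTS_TABLE = "gdelt-bq.gdeltv2.events_partitioned"


def build_events_query(day, limit=10):
    """Build a query that only touches the partition for a single day."""
    if not isinstance(day, date):
        raise TypeError("day must be a datetime.date")
    start = day.strftime("%Y-%m-%d")
    end = (day + timedelta(days=1)).strftime("%Y-%m-%d")
    return (
        "SELECT SQLDATE, Actor1Name, Actor2Name, EventCode, AvgTone, SOURCEURL\n"
        f"FROM `{EVENTS_TABLE}`\n"
        f"WHERE _PARTITIONTIME >= TIMESTAMP('{start}')\n"
        f"  AND _PARTITIONTIME < TIMESTAMP('{end}')\n"
        f"LIMIT {int(limit)}"
    )


def estimate_scanned_bytes(sql):
    """Dry-run the query so BigQuery reports the bytes it would scan."""
    from google.cloud import bigquery  # imported lazily; optional dependency

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed
```

Running `estimate_scanned_bytes(build_events_query(date(2019, 6, 1)))` before the real query is a cheap way to watch the meter, since a dry run is not billed.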

If you are looking to correlate the stock market or other time-series data to the news, this is it.

Number 3 — Global Health Observatory

The Global Health Observatory (GHO) data from the World Health Organization (WHO) is a phenomenal global health statistics resource. It contains over 1,000 indicators for 194 countries, tracked over many years!

https://www.who.int/gho/en/

C’mon people, this is the perfect data set to build that health-awareness application you always wanted to build — the world needs you!

I created a free course using GHO’s worldwide life expectancy data. Simply enter your country, gender, and age, and it will tell you how many more years of life you can expect (on average).
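If you'd rather poke at the data directly, here is a minimal sketch, and not the code from the class. It assumes the GHO OData endpoint at `ghoapi.azureedge.net`, the indicator code `WHOSIS_000001` for life expectancy at birth, and the `SpatialDim`, `TimeDim`, `Dim1`, and `NumericValue` fields; confirm all of them against the WHO API documentation. The sex codes used in `Dim1` have varied, so the caller passes one in. Note this is life expectancy at birth only; remaining years at a given age would need the life-table indicators.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL and indicator code; verify in the WHO GHO API docs.
GHO_API = "https://ghoapi.azureedge.net/api"
LIFE_EXPECTANCY_AT_BIRTH = "WHOSIS_000001"


def build_indicator_url(indicator, country_iso3):
    """Build an OData URL filtered to one country (ISO3 code, e.g. 'FRA')."""
    odata_filter = f"SpatialDim eq '{country_iso3}'"
    return f"{GHO_API}/{indicator}?$filter={quote(odata_filter)}"


def latest_value(payload, sex_code):
    """Return (year, value) for the most recent row matching the sex code."""
    rows = [
        row for row in payload.get("value", [])
        if row.get("Dim1") == sex_code and row.get("NumericValue") is not None
    ]
    if not rows:
        return None
    newest = max(rows, key=lambda row: row["TimeDim"])
    return newest["TimeDim"], newest["NumericValue"]


def fetch_life_expectancy(country_iso3, sex_code):
    """Call the API (needs network access)."""
    url = build_indicator_url(LIFE_EXPECTANCY_AT_BIRTH, country_iso3)
    with urlopen(url, timeout=10) as resp:
        return latest_value(json.load(resp), sex_code)
```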

Free health-awareness web application class

Take the class here. It's free and uses GHO data:

https://ml-entrepreneur.teachable.com/purchase?product_id=1546932

Number 4 — Real-time USGS Earthquake Hazards Data

The United States Geological Survey (USGS) Earthquake Hazards Program offers an incredible real-time data source reporting earthquakes from around the world along with their magnitudes.

It is free to use and very intuitive. Check out their earthquake map:
https://earthquake.usgs.gov/earthquakes/map//
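Behind the map, USGS publishes GeoJSON feeds you can read with a few lines of code. Here is a minimal sketch that uses the past-day summary feed and keeps only the stronger quakes. The function names and the 4.5 cutoff are my own choices, and it assumes the usual feed layout: `properties.mag`, `properties.place`, `properties.time` in milliseconds, and coordinates ordered longitude, latitude, depth.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen

# Past-day summary feed of all recorded earthquakes, in GeoJSON.
FEED_URL = (
    "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.geojson"
)


def strong_quakes(feed, min_magnitude=4.5):
    """Keep quakes at or above min_magnitude, strongest first."""
    quakes = []
    for feature in feed.get("features", []):
        props = feature["properties"]
        magnitude = props.get("mag")
        if magnitude is None or magnitude < min_magnitude:
            continue
        longitude, latitude, depth_km = feature["geometry"]["coordinates"]
        quakes.append({
            "magnitude": magnitude,
            "place": props.get("place"),
            # The feed reports time as milliseconds since the Unix epoch.
            "time_utc": datetime.fromtimestamp(props["time"] / 1000,
                                               tz=timezone.utc),
            "latitude": latitude,
            "longitude": longitude,
            "depth_km": depth_km,
        })
    return sorted(quakes, key=lambda q: q["magnitude"], reverse=True)


def fetch_strong_quakes(min_magnitude=4.5):
    """Download the feed (needs network access) and filter it."""
    with urlopen(FEED_URL, timeout=10) as resp:
        return strong_quakes(json.load(resp), min_magnitude)
```

The latitude and longitude pairs are exactly what a map layer needs, so the output plugs straight into a plotting step.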

If you’re interested in applying it to a web-based data science project, check out the Applied Machine Learning Track at the ViralML School where we use the USGS data to forecast earthquakes and plot the results on a Google Map — really cool!

You can find it at the ViralML School:

https://www.viralml.com/applied-machine-learning-tract.html

Number 5 — IEX Cloud Market Data

And last but certainly not least, there is the IEX Cloud service and its free stock market data. It is free to use as long as you keep it under their messaging threshold of 50k messages per month.

https://iexcloud.io/

I did a brief introduction to this service in one of my live webinars:
https://www.youtube.com/watch?v=4AHW7dfvvNk

And I also use it in the free class “Learn How to Create and Sell Your Machine Learning Products Online and For Free”.

Conclusion — No More Excuses

If you’re passionate about data science and want to make a difference and get noticed, these are for you. Let me know what you come up with!