Some great sources of data

posted Jan 22, 2014, 11:21 AM by Jen Mankoff

Data.gov: "The home of the U.S. Government’s open data. Here you will find data, tools, and resources to conduct research, develop web and mobile applications, design data visualizations, and more."

Kaggle's competitions give you access to data and a goal.

Lots of folks have begun uploading data to sites such as Google Fusion Tables (where you can search for tables on different topics) and ManyEyes, which provide online tools for data exploration that require little or no programming.

Websites such as DataMob and KDNuggets collect data sets and innovative tools for data mining and wrangling.

A Google search for 'data sets for machine learning' turned up numerous repositories, including UCI's Machine Learning Repository.

If you want big data, you may want to look at the public data sets hosted on Amazon Web Services (AWS). Some further ideas can be found in this Quora discussion and in this list at 'hadoopunlimited'.

Have fun!