sp19

Starting With the Basics: Regression

You always start with the basics, and Data Science is no different! We'll get our feet wet with some simple but powerful models and demonstrate their power by applying them to real-world data.
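
As a taste of what "simple but powerful" can look like, here is a minimal sketch of fitting a straight line by ordinary least squares. It is not the lecture's own code: the function name `fit_line` and the toy data are assumptions made up for illustration, and only the Python standard library is used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one input feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

On real-world data the points will not sit exactly on a line, so the fitted slope and intercept are the values that minimize the squared error rather than an exact match.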

A Few Useful Things to Know About Machine Learning, by Pedro Domingos

Abstract: Machine learning algorithms can figure out how to perform important tasks by generalizing from examples. This is often feasible and cost-effective where manual programming is not. As more data becomes available, more ambitious problems can be tackled. As a result, machine learning is widely used in computer science and other fields. However, developing successful machine learning applications requires a substantial amount of "black art" that is hard to find in textbooks. This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions.

Getting Started With Neural Networks

You've heard about them: beating humans at all types of games, driving cars, and recommending your next Netflix series. But what ARE neural networks? In this lecture, you'll learn step by step how neural networks function and how they learn. Then, you'll deploy one yourself!
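
To make the "how they learn" part concrete, here is a minimal sketch (not the lecture's actual code) of a single sigmoid neuron, the smallest building block of a neural network, trained with gradient descent to reproduce logical AND. The toy data, the learning rate, and the epoch count are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(data, lr=0.5, epochs=5000):
    """Full-batch gradient descent on cross-entropy loss for one neuron."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (x1, x2), y in data:
            p = sigmoid(w[0] * x1 + w[1] * x2 + b)
            err = p - y  # derivative of the loss w.r.t. the pre-activation
            grad_w[0] += err * x1
            grad_w[1] += err * x2
            grad_b += err
        # Step against the average gradient
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


# Truth table for AND: the output is 1 only when both inputs are 1
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_neuron(and_data)
predictions = [round(predict(w, b, x)) for x, _ in and_data]
print(predictions)
```

A full network stacks many such units in layers and uses backpropagation to compute the same kind of gradient for every weight.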

Intro to Machine Learning Topics

Summary: This paper includes a brief introduction to and history of machine learning as well as brief summaries of topics in the field.

How Computers Can See and Other Ways Machines Can Think

Ever wonder how Facebook can tell you which friends to tag in your photos or how Google automatically makes collages and animations for you? This lecture is all about that: We'll teach you the basics of computer vision using convolutional neural networks so you can make your own algorithm to automatically analyze your visual data!
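
The core operation inside a convolutional neural network is sliding a small filter over an image. Here is a minimal sketch of that operation in plain Python. The 4x4 image and the edge-detecting kernel are hypothetical, and a real network learns its kernel values instead of having them written by hand.

```python
def convolve2d(image, kernel):
    """Valid-mode sliding-window filter (cross-correlation, as CNN layers compute it)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # Multiply the kernel with the image patch under it and sum
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 image: dark left half (0), bright right half (1)
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Kernel that responds where brightness increases from left to right
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The nonzero middle column marks the vertical edge: the filter "lights up" exactly where the dark half meets the bright half.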

Deep Learning

Abstract: Deep learning allows for computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

Who Needs Show Writers Nowadays?

This lecture is all about Recurrent Neural Networks. These are networks with added memory, which means they can learn from sequential data such as speech, text, videos, and more. Different types of RNNs and strategies for building them will also be covered. For the project, we'll build an LSTM-RNN to generate new, original scripts for the TV series “The Simpsons”. Come and find out if our networks can become better writers for the show!
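
The "added memory" can be sketched in a few lines. Below is a minimal, hypothetical recurrent cell with a single scalar hidden state. It is a plain RNN and not the LSTM used in the project, and the weights `w_x`, `w_h`, and `b` are made-up values, not trained ones.

```python
import math


def rnn_forward(inputs, w_x=1.0, w_h=0.5, b=0.0):
    """Run a one-unit recurrent cell over a sequence and return every hidden state."""
    h = 0.0  # the memory starts out empty
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# Same values, different order: the final state differs because of the memory
states_a = rnn_forward([1.0, 0.0])
states_b = rnn_forward([0.0, 1.0])
print(states_a[-1], states_b[-1])
```

Because each state feeds into the next one, the cell's final state depends on the order of the inputs, which is what lets it model sequences such as text.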

Handwritten Digit Recognition With a Back-Propagation Network

Abstract: We present an application of back-propagation networks to hand-written digit recognition. Minimal preprocessing of the data was required, but architecture of the network was highly constrained and specifically designed for the task. The input of the network consists of normalized images of isolated digits. The method has 1% error rate and about a 9% reject rate on zipcode digits provided by the US Postal Service.

What Makes Deep Learning More of an Art Than a Science?

Some of the hardest aspects of Machine Learning are the details. Almost every algorithm we use is sensitive to "hyperparameters" which affect the initialization, optimization speed, and even the possibility of becoming accurate. We'll cover the general heuristics you can use to figure out what hyperparameters to use, how to find the optimal ones, what you can do to make models more resilient, and the like. This workshop will be pretty "down-in-the-weeds" but will give you a better intuition about Machine Learning and its shortcomings.
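
To see how sensitive training can be to a single hyperparameter, here is a minimal sketch of a grid search over the learning rate. The loss function (a one-dimensional quadratic), the candidate values, and the step count are all assumptions picked to keep the example tiny. Real models need a validation set and far more expensive training runs.

```python
def final_loss(lr, steps=20, start=0.0, target=3.0):
    """Minimize (w - target)^2 with plain gradient descent; return the final loss."""
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - target)
        w -= lr * grad
    return (w - target) ** 2


# Hypothetical grid of learning rates to try
candidate_lrs = [0.01, 0.1, 0.5, 1.1]
losses = {lr: final_loss(lr) for lr in candidate_lrs}
best_lr = min(losses, key=losses.get)

for lr in candidate_lrs:
    print(lr, losses[lr])
print("best:", best_lr)
```

In this sketch the smallest rate barely moves in 20 steps, while the largest one overshoots further on every step and diverges. That is the "possibility of becoming accurate" problem in miniature.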

Sparse Autoencoders

Summary: These notes describe the sparse autoencoder learning algorithm, which is one approach to automatically learn features from unlabeled data. In some domains, such as computer vision, this approach is not by itself competitive with the best hand-engineered features, but the features it can learn do turn out to be useful for a range of problems (including ones in audio, text, etc).
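
The summary does not spell out how sparsity is enforced. One common formulation (assumed here) adds a KL-divergence penalty that pushes each hidden unit's average activation toward a small target value. The sketch below computes only that penalty term. The function name, the target of 0.05, and the example activations are hypothetical.

```python
import math


def kl_sparsity_penalty(rho, rho_hats):
    """Sum over hidden units of KL(rho || rho_hat) between two Bernoulli distributions.

    rho      -- target average activation (assumed small, e.g. 0.05)
    rho_hats -- measured average activation of each hidden unit, each in (0, 1)
    """
    total = 0.0
    for rho_hat in rho_hats:
        total += (rho * math.log(rho / rho_hat)
                  + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))
    return total


target = 0.05
on_target = kl_sparsity_penalty(target, [0.05, 0.05])   # units already sparse
too_active = kl_sparsity_penalty(target, [0.5, 0.5])    # units firing half the time
print(on_target, too_active)
```

In a full autoencoder this term would be scaled by a weight and added to the reconstruction error, so the optimizer trades off accurate reconstruction against keeping most hidden units quiet.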

Cleaning and Manipulating a Dataset With Python

In the fields of Data Science and Artificial Intelligence, your models and analyses will only be as good as the data behind them. Unfortunately, you will find that the majority of datasets you encounter will be filled with missing, malformed, or erroneous data. Thankfully, Python provides a number of handy libraries to help you clean and manipulate your data into a usable state. In today's lecture, we will leverage these Python libraries to turn a messy dataset into a gold mine of value!
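
Here is a minimal sketch of what "cleaning" can mean in practice. To keep it runnable anywhere, it uses only the standard library's `csv` module, which is a swap made for this example, since a library such as pandas would be the more usual choice. The messy table, the column names, and the rule of dropping rows with an unusable age are all assumptions for illustration.

```python
import csv
import io

# Hypothetical messy data: a missing age, stray whitespace, inconsistent
# capitalization, and a malformed number
RAW = """name,age,city
Alice,34,Orlando
Bob,,Tampa
 carol ,29,orlando
Dave,abc,Miami
"""


def clean_rows(text):
    """Normalize text fields and drop rows whose age is missing or malformed."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        age_text = row["age"].strip()
        if not age_text.isdigit():
            continue  # skip rows with a missing or non-numeric age
        cleaned.append({"name": name, "age": int(age_text), "city": city})
    return cleaned


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Dropping bad rows is only one option. Depending on the dataset, filling in a default or an average value may preserve more information.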

A Critical Review of Recurrent Neural Networks for Sequence Learning

Abstract: Countless learning tasks require dealing with sequential data. Image captioning, speech synthesis, and music generation all require that a model produce outputs that are sequences. In other domains, such as time series prediction, video analysis, and musical information retrieval, a model must learn from inputs that are sequences. Interactive tasks, such as translating natural language, engaging in dialogue, and controlling a robot, often demand both capabilities. Recurrent neural networks (RNNs) are connectionist models that capture the dynamics of sequences via cycles in the network of nodes. Unlike standard feedforward neural networks, recurrent networks retain a state that can represent information from an arbitrarily long context window. Although recurrent neural networks have traditionally been difficult to train, and often contain millions of parameters, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful large-scale learning with them. In recent years, systems based on long short-term memory (LSTM) and bidirectional (BRNN) architectures have demonstrated ground-breaking performance on tasks as varied as image captioning, language translation, and handwriting recognition. In this survey, we review and synthesize the research that over the past three decades first yielded and then made practical these powerful learning models. When appropriate, we reconcile conflicting notation and nomenclature. Our goal is to provide a self-contained explication of the state of the art together with a historical perspective and references to primary research.

A Walk Through the Random Forest

Neural Nets are not the be-all and end-all of Machine Learning. In this lecture, we will see how a decision tree works and how powerful a collection of them can be. From there, we will use Random Forests to do digit recognition.
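
The idea of "a collection of trees" can be sketched with the simplest tree possible. The code below is a simplified stand-in, not a full Random Forest: each tree is a depth-1 "stump" on a single feature, trained on a bootstrap sample, and the forest predicts by majority vote. Real implementations grow deeper trees and also sample random subsets of features. All names and the toy data are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity: 0 when all labels agree, larger when they are mixed."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(xs, ys):
    """Depth-1 tree: pick the threshold with the lowest weighted Gini impurity."""
    majority = Counter(ys).most_common(1)[0][0]
    best = (gini(ys), None, majority, majority)  # (impurity, threshold, left, right)
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[0]:
            best = (score, t,
                    Counter(left).most_common(1)[0][0],
                    Counter(right).most_common(1)[0][0])
    return best[1:]


def predict_stump(stump, x):
    threshold, left_label, right_label = stump
    if threshold is None:  # no useful split was found
        return left_label
    return left_label if x <= threshold else right_label


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Majority vote across all trees."""
    votes = Counter(predict_stump(s, x) for s in forest)
    return votes.most_common(1)[0][0]


# Hypothetical one-feature data: small values are class 0, large values class 1
xs = [1, 2, 3, 4, 10, 11, 12, 13]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(xs, ys)
print(predict_forest(forest, 0), predict_forest(forest, 14))
```

Because every tree sees a slightly different sample, individual trees can make different mistakes, and the vote smooths those mistakes out.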

Deep Visual-Semantic Alignments for Generating Image Descriptions

Abstract: We present a model that generates natural language descriptions of images and their regions. Our approach leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between language and visual data. Our alignment model is based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding. We then describe a Multimodal Recurrent Neural Network architecture that uses the inferred alignments to learn to generate novel descriptions of image regions. We demonstrate that our alignment model produces state of the art results in retrieval experiments on Flickr8K, Flickr30K and MSCOCO datasets. We then show that the generated descriptions significantly outperform retrieval baselines on both full images and on a new dataset of region-level annotations.

Generative Adversarial Networks

Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through
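
The minimax game described in this abstract can be written compactly. The notation below is an assumption for this sketch: p_data is the data distribution, p_z is the noise distribution fed to G, and p_g is the distribution of G's samples.

```latex
\[
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
\]

% For a fixed G, the discriminator that maximizes V is
\[
D^{*}_{G}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}
\]
```

When G recovers the data distribution, so that p_g equals p_data, the second expression reduces to 1/2 for every x, matching the "D equal to 1/2 everywhere" statement in the abstract.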

Support Vector Machines

Support Vector Machines were among the most widely used ML algorithms before Neural Nets came back into the foreground. Unlike Neural Nets, SVMs can explain themselves quite well, which lets us use these ML models in fields like medicine and finance, where regulations require that we be able to interrogate our models.

Making Sense of High-Dimensional Data

We're filling this out!

Machine Learning Applications

We're filling this out!