AI News, Meetup: No-Bullshit Data Science

In this talk I will apply the much-needed methods of critical thinking and quantitative measurement (which data scientists are supposed to use daily in solving problems for their companies) to assess the capabilities of the most widely used software tools for data science.

I will discuss two such analyses in detail, one concerning the size of datasets used for analytics and the other the performance of machine learning software used for supervised learning.

About a decade ago he moved to Santa Monica, California, to become the Chief Scientist of a credit card processing company, doing everything data (ETL, analysis, modeling, visualization, machine learning, etc.).

Interview: Szilard Pafka, Data Scientist

The Los Angeles data science Meetup scene is booming in large part due to the efforts of a local data scientist, Szilard Pafka.

I have been using data science (data analysis, data visualization, modeling/machine learning, etc.) mostly in this domain (for example, for detecting credit card fraud or measuring/monitoring performance) for quite a while (well before “data science” became a term used to describe this, or became “cool”).

I’m also not the typical extroverted community person, so when I started the LA R meetup in 2009 (which in retrospect was the very first data science meetup in Los Angeles), I did so mainly so that I could deepen my knowledge of R by interacting regularly with the few people who were using R at the time.

Not so long ago I realized that I had been spending a lot of time doing data visualization and thinking about how all these data tools and knowledge are put together in different companies, and so the DataVis LA and the Data Science LA meetups were born (the Data Science meetup uses the Machine Learning group’s meetup infrastructure).

A survey of tools used by data scientists (it turns out it’s mainly R and Python) led to the creation of the Python Data Science meetup, while the overlap with traditional business analytics (data warehouse, BI, etc.) led to the launch of the DW/BI meetup (the latter two with fellow co-organizer Eduardo Arino de la Rubia).

In fact, with 5 meetup groups now, I would still like to focus primarily on quality and not quantity, so even though there are more meetup groups (and others getting involved in organizing), I don’t think the total number of events we’ll be putting together will increase dramatically.

It also looks like tech meetups in general have started to play a larger and larger role not only in general networking, but also in companies using them as a primary avenue for hiring.

slides, code, pictures, sometimes video recording) of the above-mentioned meetups; it provides a venue for local data scientists to publish blog posts on various topics, and we also have other interesting content. Please check it out.

Like many people in physics at that time (late 90s), I was working with data, models, and computational approaches, and I ended up working in risk management in a bank while still working on my PhD research involving statistical modeling of financial prices.

In 2006 I came to California to be the Chief Scientist for Epoch, essentially doing data science (data analysis, modeling/machine learning, data visualization, etc.) way before the term “data science” was used to describe this.

In 2009 I started organizing an R meetup in Los Angeles (which in retrospect was the very first data science meetup in LA) with the goal of bringing together data professionals to learn from each other.

More recently I started other meetup groups focused on my other professional interests (DataVis, Data Science), and finally, a few weeks ago, with the involvement of a couple of other volunteers, we started datascience.la, a website serving the growing LA data science community.

Later on I got involved in data, modeling and computing. My Monte Carlo simulations in the field of materials science (dislocation systems) generated lots of data that needed to be analyzed, and I think that's how I started to use tools for data munging/analysis/visualization more seriously.

We also now have open source tools to tackle larger and larger datasets (Hadoop, but more excitingly for data scientists, tools that support interactive analysis such as Impala or Spark).

With the rise of “data science” as a term for our essentially old craft, we started to have events on more general topics, and ultimately I started new meetup groups to focus on specific parts of data science (DataVis) or the overall process of combining tools in various companies.

DataScience.LA takes this to a new level, by preserving the content of the meetups (slides, code, video recording etc.) and involving the community in new ways (such as blogging).

It is also a way to scale up the community leadership by involving top-notch data scientists from LA in serving the needs of the growing data science community.

Don't get discouraged by low attendance: we had about 30 people attending the R meetup in the first 2 years (well, the number of R users in general exploded only after that).

- Epoch is an online credit card transaction processor, so obviously the main problem is fraud detection, but there are many other areas, for example sales tracking, marketing, or consumer satisfaction, that can be improved by models or insights from data.

- Unfortunately I'm not allowed to share details about Epoch, but my general philosophy is to start with a business problem a company needs to solve (usually improving the bottom line), understand the domain, look at the data, and come up with solutions that are best suited to the problem; the outcome can be advice for an action or a model that can be deployed.

Ideally, data scientists learn the domain knowledge in the various parts of the organization, explore the data, give strategic advice, and develop models that can operationalize micro-decisions, but they also disseminate a data-centric view across the organization and mentor key personnel in other departments so that they can increasingly use data and the results of data analysis and models in their day-to-day jobs.

- Let me quote Niels Bohr: “Prediction is very difficult, especially if it's about the future.” Data science is good at predicting micro-events where we have data about lots of past micro-events, we fit a distribution (implicitly most of the time e.g.

Pafka Szilard: R Stories from the Trenches

This talk was presented at the Budapest Users of R Network on Aug 26 2015 with two parts: • In the first part I will reveal why I switched to R for most of my data ...
