MESSAGE
DATE | 2022-03-05 |
FROM | Ruben Safir
|
SUBJECT | [Hangout - NYLXS] AI AND THE INTERNET
|
https://thenewstack.io/context-apache-spark-for-artificial-intelligence-and-ai-2-0/
Context: Apache Spark for Artificial Intelligence and AI 2.0
19 Apr 2019 3:00pm, by Libby Clark
KubeCon + CloudNativeCon and InfluxData sponsored this podcast.
Today on The New Stack Context we talk with Garima Kapoor, COO and co-founder of MinIO, about using Spark at scale for Artificial Intelligence and Machine Learning (AI/ML) workloads on Kubernetes.
The Apache and Hadoop ecosystem hasn’t had much overlap with Kubernetes in the past, but as we learned at KubeCon in Seattle last November, that is quickly changing. As Iguazio’s Yaron Haviv wrote in a contributed article on TNS titled “Will Kubernetes Sink the Hadoop Ship?”: “Early adopters are realizing that they can run their big data stack (Spark, Presto, Kafka, etc.) on Kubernetes in a much simpler manner. Furthermore, they can run all of the cool post-Hadoop AI and data science tools like Jupyter, TensorFlow, PyTorch or custom Docker containers on the same cluster.”
Sponsor Note: InfluxData delivers a complete open-source platform built specifically for metrics, events, and other time-based data: a modern time-series platform. Whether the data comes from humans, sensors, or machines, InfluxData empowers developers to build next-generation monitoring, analytics, and IoT applications faster, easier, and to scale delivering real business value quickly.
Fast forward to now: the Spark + AI Summit, which Databricks is putting on in San Francisco, is next week, and we are curious. How is Spark being used in cloud native architectures these days with the likes of MinIO, the open source, container-native object store, to, say, create machine learning data pipelines on Kubernetes? What is driving this trend toward high-performance object stores? Kapoor breaks down the trends in the first half of the show.
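As a rough illustration of the Spark-plus-object-store setup discussed here, the sketch below builds the Spark properties that point the Hadoop S3A connector at an S3-compatible endpoint such as MinIO. It is not taken from the podcast. The `fs.s3a.*` property names are the connector's standard settings; the helper names (`minio_s3a_conf`, `as_submit_args`), the service address, and the credentials are hypothetical placeholders, and the sketch assumes the `hadoop-aws` connector is already on the Spark classpath.

```python
from urllib.parse import urlparse


def minio_s3a_conf(endpoint, access_key, secret_key):
    """Return Spark properties that point the Hadoop S3A connector at an
    S3-compatible endpoint such as a MinIO service (hypothetical helper)."""
    scheme = urlparse(endpoint).scheme
    if scheme not in ("http", "https"):
        raise ValueError("endpoint must start with http:// or https://")
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # MinIO is usually addressed as host/bucket rather than bucket.host.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        # Enable TLS only when the endpoint itself uses https.
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(scheme == "https").lower(),
    }


def as_submit_args(conf):
    """Render the properties as --conf arguments for spark-submit."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Hypothetical in-cluster service address and placeholder credentials.
    conf = minio_s3a_conf("http://minio.example.svc:9000", "ACCESS_KEY", "SECRET_KEY")
    print(" ".join(as_submit_args(conf)))
```

With properties like these applied, a Spark job would address data as `s3a://<bucket>/<path>`; in a real cluster the credentials would normally come from a Kubernetes secret rather than literals.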
Later in the show, Joab Jackson, The New Stack’s managing editor, gives us the highlights from the O’Reilly AI Conference in New York this week. Now that machine learning has firmly entered the corporate world, we need to find ways of making it robust, reliable and secure, a number of speakers at that event argued.
Sponsor Note: KubeCon + CloudNativeCon conferences gather adopters and technologists to further the education and advancement of cloud native computing. The vendor-neutral events feature domain experts and key maintainers behind popular projects like Kubernetes, Prometheus, Envoy, CoreDNS, containerd, and more.
At the conference, it was Massachusetts Institute of Technology faculty member Aleksander Madry who first called for AI 2.0 (though the term is probably inevitable in this industry, we suppose). Today’s AI is not nearly robust enough, insufficiently secure, and still way too unpredictable. The next generation of the technology must be “much more aligned with what we humans see as significant,” he said during his keynote.
And indeed, many of the talks, presentations and sponsor booths were centered around the idea of making AI more mature. In one presentation, Microsoft data scientists Fidan Boylu Uz and Mathew Salvaris demonstrated three ways to do Kubernetes-based deep learning in a production setting. One approach uses kubectl as the launching point; it offers the most flexibility for those who know how to manage K8s. Another is Kubeflow, a Google project that packages the whole AI pipeline; it would suit a research scientist who just wants to use their favorite libraries, such as TensorFlow and PyTorch. Lastly, of course, there was the Microsoft AzureML service, which was the easiest to deploy because it does a lot of the configuration and build work itself, though, unlike the other approaches, it limits you to Microsoft Azure as your cloud.
Still, social media companies are getting immense criticism for how their AI algorithms tend to surface more extreme and outright toxic content. There still needs to be a reconciliation between what the machines suggest as the best answers and what we humans consider acceptable. One factor could be a lack of diversity in the workforce. The less inclusive and more homogeneous the development team building the AI is, the more likely the AI will contain unintentional biases, Dataiku’s Kurt Muehmel pointed out in his own talk on AI Ethics.

Feature image by Faizal Sugi from Pixabay.
The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Docker, Dataiku.
--
So many immigrant groups have swept through our town that Brooklyn, like Atlantis, reaches mythological proportions in the mind of the world - RI Safir 1998
http://www.mrbrklyn.com
DRM is THEFT - We are the STAKEHOLDERS - RI Safir 2002
http://www.nylxs.com - Leadership Development in Free Software
http://www.brooklyn-living.com
Being so tracked is for FARM ANIMALS and extermination camps, but incompatible with living as a free human being. -RI Safir 2013
_______________________________________________
Hangout mailing list
Hangout-at-nylxs.com
http://lists.mrbrklyn.com/mailman/listinfo/hangout
|
|