头图

In November 2021, Microsoft open sourced a simple, multilingual, massively parallel machine learning library SynapseML (formerly MMLSpark) to help developers simplify the creation of machine learning pipelines. For details, see Microsoft's deep learning library SynapseML: 45 different machine learning services can be directly embedded in the system, and text translation in more than 100 languages is supported .

August 12, 2022 Microsoft released SynapseML for .NET on the .NET Blog, building on its large-scale machine learning open source project SynapseML, which debuted last November. As part of the new SynapseML v0.10 release, Microsoft announced a new set of .NET APIs for massively scalable machine learning. "This allows us to author, train, and use any SynapseML model from C#, F#, or other languages in the .NET family through the .NET for Apache Spark language bindings," the blog post said.


Researcher at Microsoft MVP Labs

图片

Zhang Shanyou<br>Shenzhen Youhaoda CTO & Chief Architect, Microsoft Most Valuable Expert, Microsoft SWAT Expert, more than 20 years of software R&D experience, worked in Tencent for 12 years, member of .NET Foundation, .NET Cloud Native Consultant and Solution Solution expert, operating WeChat public account "dotnet cross-platform" and "distributed application runtime"

图片


SynapseML runs on Apache Spark and requires Java to be installed because Spark uses the JVM to run Scala. However, it has bindings for other languages like Python or R. The current 0.10.0 release adds bindings for .NET languages. The tool helps developers build scalable intelligent systems in a variety of Microsoft domains, including:

  • deep learning
  • model interpretability
  • computer vision
  • Reinforcement Learning and Personalization
  • abnormal detection
  • search and retrieval
  • Form and face recognition
  • speech processing
  • Gradient boosting
  • Text Analysis
  • Microservice Orchestration
  • Translate Microsoft

When the project was first open-sourced last year, it was said that "a unified API standardizes many of today's tools, frameworks and algorithms, simplifying the distributed ML experience, which enables developers to rapidly build different ML frameworks for use cases that require multiple frameworks. , such as web supervised learning, search engine creation, etc. It can also train and evaluate models on single-node, multi-node, and elastically resizable computer clusters, so developers can scale their work without wasting resources .". Is this sentence familiar to students who are familiar with Dapr , another open source project donated to CNCF by Microsoft?

SynapseML for .NET is included in a set of SynapseML NuGet packages. These packages have not been published to the main NuGet source, and their sources must be added manually. Once installed, the SynapseML API can be called from a .NET application.

The following code snippet illustrates how to call the SynapseML API from a C# application.

 // Create LightGBMClassifier
var lightGBMClassifier = new LightGBMClassifier()
     .SetFeaturesCol("features")
     .SetRawPredictionCol("rawPrediction")
     .SetObjective("binary")
     .SetNumLeaves(30)
     .SetNumIterations(200)
     .SetLabelCol("label")
     .SetLeafPredictionCol("leafPrediction")
     .SetFeaturesShapCol("featuresShap");
// Fit the modelvar lightGBMClassificationModel = lightGBMClassifier.Fit(trainDf);
// Apply transformation and displayresults
lightGBMClassificationModel.Transform(testDf).Show(50);

SynapseML allows developers to call other services in their pipeline. The library supports Microsoft's own Cognitive Services , a set of generic AI services powered by models trained by Microsoft. Additionally, the current version of SynapseML allows developers to leverage pretrained OpenAI models in their solutions, such as GPT-3 for natural language understanding and generation, and Codex for code generation. Currently using OpenAI models requires access to the Azure OpenAI service.

Finally, the current release adds support for MLflow , a platform for managing the ML lifecycle. Developers can use it to load and save models, and log messages during model execution.

There is now a new member SynapseML in the .NET machine learning community:

  • ML.NET is a .NET library for running stand-alone workloads using .NET languages:

  • Apache Spark for .NET : Provides .NET support for the Apache Spark distributed computing framework
  • Microsoft Cognitive Toolkit (CNTK) is a Microsoft ML library. It also has a .NET API which he has stopped developing.
  • Accord.NET , a .NET machine learning library for vision and audio processing, has been discontinued.
  • In the .NET community, there is confusion among developers about how all these libraries compare to each other or whether they replace each other. SynapseML project members actively answer these questions on Reddit .
    图片

SynapseML is built on the Apache Spark for .NET project, which provides .NET support for the Apache Spark distributed computing framework. Apache Spark is written in Scala, a language on the JVM, but has language bindings for Python, R, .NET, and others. This release adds full .NET language support for all models and learners in the SynapseML library, so you can author distributed machine learning pipelines in .NET for execution on Apache Spark clusters.


Microsoft Most Valuable Professional (MVP)

图片

The Microsoft Most Valuable Professional is a global award given to third-party technology professionals by Microsoft Corporation. For 29 years, technology community leaders around the world have received this award for sharing their expertise and experience in technology communities both online and offline.

MVPs are a carefully selected team of experts who represent the most skilled and intelligent minds, passionate and helpful experts who are deeply invested in the community. MVP is committed to helping others and maximizing the use of Microsoft technologies by Microsoft technical community users by speaking, forum Q&A, creating websites, writing blogs, sharing videos, open source projects, organizing conferences, etc.

For more details, please visit the official website: https://mvp.microsoft.com/zh-cn


Head over to the original blog to learn more about SynapseML for .NET .


微软技术栈
423 声望995 粉丝

微软技术生态官方平台。予力众生,成就不凡!微软致力于用技术改变世界,助力企业实现数字化转型。