Saturday, 14 January 2017

Reuse: Statistical Data Processing

When a system needs statistical data analysis, we must either implement the required components ourselves or reuse existing ones.
Microsoft has been making the open-source R statistical programming language, along with R servers, available (free up to a certain point):
http://www.infoworld.com/article/3156544/big-data/microsofts-r-tools-bring-data-science-to-the-masses.html


Quoting (the article focuses on a Microsoft acquisition, but it also gives an overview of the R programming language and the development environment that supports developers and data scientists):


"One of Microsoft’s more interesting recent acquisitions was Revolution Analytics, a company that built tools for working with big data problems using the open source statistical programming language R. Mixing an open source model with commercial tools, Revolution Analytics offered a range of tools supporting academic and personal use, alongside software that took advantage of massive amounts of data–including Hadoop. Under Microsoft’s stewardship, the now-renamed R Server has become a bridge between on-premises and cloud data.
Two years on, Microsoft has announced a set of major updates to its R tools. The R programming language has become an important part of its data strategy, with support in Azure and SQL Server—and, more important, in its Azure Machine Learning service, where it can be used to preprocess data before delivering it to a machine learning pipeline. It’s also one of Microsoft’s key cross-platform server products, with versions for both Red Hat Linux and Suse Linux."
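
To make the "preprocess data before delivering it to a machine learning pipeline" idea a bit more concrete, here is a minimal sketch of reusing an existing statistical component instead of writing one from scratch. It is only an illustration: the article is about R, while this sketch uses Python's standard `statistics` module, and the `standardize` helper and the sample values are my own assumptions, not anything taken from the article or from Microsoft's tools.

```python
from statistics import mean, stdev


def standardize(values):
    """Rescale values to zero mean and unit sample standard deviation.

    Hypothetical helper for illustration; it reuses the standard library's
    mean() and stdev() rather than reimplementing them.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        raise ValueError("cannot standardize constant data")
    return [(v - mu) / sigma for v in values]


# Made-up measurements, invented purely for this example.
raw = [12.0, 15.0, 11.0, 18.0, 14.0]
scaled = standardize(raw)
```

The point is the reuse: the statistical work is delegated to a component that already exists and is already tested, and our own code only wires it into the preprocessing step.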