R World News

By boB Rudis & Jay Jacobs

A weekly digest of latest packages, papers, blog posts and happenings in the R world, plus book reviews and interviews with those involved in the R community.

May 14, 2016
R World News - Episode 2
R 3.3.0 is here with a host of new features, many aimed at those who do R package development (http://r.789695.n4.nabble.com/R-3-3-0-is-released-td4720368.html).
The repr package (https://cran.r-project.org/web/packages/repr/README.html) by Philipp A., Thomas Kluyver, Jan Schulz, abielr, Denilson Figueiredo de Sa, Jim Hester & karldw is on CRAN. While it was designed to support the IRKernel (the R kernel) of IPython and (now) Jupyter notebooks, it provides core functionality to reliably create readable text (and viewable image) representations of data without the side effects print() can cause, such as invoking a pager and plotting to a plot device. In other words, all repr functions and methods are pure. It will also be able to enhance knit output and potentially replace R CMD Rd2pdf.
acs (https://cran.r-project.org/web/packages/acs/README.html) is at version 2.0 and provides a general toolkit for downloading, managing, analyzing, and presenting data from the U.S. Census, including SF1 (Decennial short-form), SF3 (Decennial long-form), and the American Community Survey (ACS). Confidence intervals provided with ACS data are converted to standard errors to be bundled with estimates in complex acs objects. The package provides new methods to conduct standard operations on acs objects and present/plot data in statistically appropriate ways. Current version is 2.0 +/- .033. Support for the 2014 ACS is now provided (no more bare API calls!).
ShinyDevCon happened earlier this year and the videos are now up! (https://blog.rstudio.org/2016/05/05/shinydevcon-videos-now-available/). We bring on Bhaskar Karambelkar (https://twitter.com/bhaskar_vk) to give us the attendee's view of what to watch first.
Finally, we talk about Docker Beta for Mac OS X & Windows and how it fits super-well into the R ecosystem thanks to rocker (https://github.com/rocker-org/rocker).
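On the acs item above: the conversion from confidence intervals to standard errors can be illustrated with plain arithmetic. This is a minimal base-R sketch, not the acs package's own code. It assumes the published margin of error is at the 90 percent level (the Census Bureau's usual convention for ACS tables), and the estimate and margin of error below are made-up numbers for illustration only.

```r
# Hypothetical ACS-style estimate and its published margin of error (MOE).
# Both values are invented for illustration.
estimate <- 52000
moe_90   <- 1645   # assumed to be a 90% margin of error

# A two-sided 90% interval uses z = qnorm(0.95), roughly 1.645.
z_90 <- qnorm(0.95)
se   <- moe_90 / z_90

# With the standard error in hand, an interval at another level is easy
# to rebuild, here 95%.
ci_95 <- estimate + c(-1, 1) * qnorm(0.975) * se

se      # close to 1000
ci_95   # lower and upper bounds of the 95% interval
```

Carrying the standard error rather than the interval is what makes later operations (sums, ratios, re-leveled intervals) straightforward.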
May 9, 2016
R World News - Episode 1
The forecast package (https://cran.rstudio.com/web/packages/forecast/index.html) by Rob Hyndman is now at version 7.1 and includes bug fixes along with some improvements, such as support for multivariate linear models. One of the more notable additions is built-in support for plotting forecast objects using ggplot2. There are 11 autoplot() S3 methods for various forecast objects, 8 “gg” methods for directly starting ggplot2 plot constructs, 2 fortify() methods for transforming forecast objects into data.frames and a new geom_forecast() which lets you easily incorporate forecast objects with other geom_s or annotations.
The ggnetwork package (https://briatte.github.io/ggnetwork/) by Francois Briatte has made the jump from devtools into CRAN and provides support for the graphical display of virtually anything you can build with the network or igraph packages. This is the second package featured in today’s episode that takes advantage of the newly enhanced object model of ggplot2, making it straightforward to add scales, Geoms, Stats and even Coords (coordinate systems). The package authors provide a number of Geoms and themes, including the core ones: geom_edges() and geom_nodes().
rprojroot (http://krlmlr.github.io/rprojroot/) is a new utility package by Kirill Müller designed to ease the pain of referencing scripts or files in project subdirectories. Whether you’re building a package, working in an RStudio project or just in a git-managed directory, rprojroot has a simple interface for finding the directory root and letting you make the subdirectory & file references from that point. As the package author says, this solves a seemingly trivial but annoying problem that most of us encounter at one time or another.
The next two packages work great together when you want to process a corpus or three in R.
tokenizers (https://cran.rstudio.com/web/packages/tokenizers/index.html), by Lincoln Mullen & Dmitriy Selivanov, provides a consistent interface for breaking up a corpus into components such as n-grams, words, word stems, lines, sentences, paragraphs and more. It uses the robust stringi package for much of its core functionality and it returns plain R vectors vs custom objects, making the transformed texts easy to use and manipulate.
The tidytext package (https://cran.rstudio.com/web/packages/tidytext/index.html) by Julia Silge, David Robinson & Gabriela De Queiroz uses tokenizers (and a few other packages) to transform a corpus into tidy data.frames that enable the use of dplyr and dplyr-like idioms in further processing. tidytext also provides tools for sentiment analysis and transforming objects to/from term/document matrix objects.
Finally, Kurt Hornik & Florian Schwendinger beat hrbrmstr to the next package: pandocfilters (https://cran.rstudio.com/web/packages/pandocfilters/README.html). This works with something called the abstract syntax tree (AST) generated each time pandoc is called to transform one document format to another. The AST is a JSON file with a node for each token. You can write transformation functions for one or more node types in plain R code and then have pandoc process the resultant, modified AST into the desired format. One of the basic examples shown by the authors is to write a filter to transform all text nodes to lower-case, but you can do anything to any node type and even create ASTs from scratch, all with R code. We suspect this will be something that can be easily used with knitr/rmarkdown in the not-too-distant future.
Plus a featurette on the feather (https://blog.rstudio.org/2016/03/29/feather/) package.
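To make the pandocfilters idea above a little more concrete, here is a rough base-R sketch of the lower-casing filter concept. It does not use the pandocfilters API. Instead it assumes a small hand-built nested list shaped like a parsed pandoc AST fragment, where each node has a type tag `t` and optional contents `c`. The helper name `lower_str` is hypothetical.

```r
# A hand-built fragment shaped like part of pandoc's JSON AST once parsed
# into R lists (an assumption for illustration, not real pandoc output).
para <- list(
  t = "Para",
  c = list(
    list(t = "Str", c = "Hello"),
    list(t = "Space"),
    list(t = "Str", c = "World")
  )
)

# Hypothetical helper: walk the tree recursively and lower-case the
# contents of every Str node, leaving all other nodes untouched.
lower_str <- function(node) {
  if (!is.list(node)) return(node)
  if (identical(node$t, "Str")) {
    node$c <- tolower(node$c)
    return(node)
  }
  lapply(node, lower_str)
}

result <- lower_str(para)

# Flatten the inline nodes back to text to inspect the effect.
words <- vapply(result$c,
                function(n) if (is.null(n$c)) " " else n$c,
                character(1))
paste(words, collapse = "")   # "hello world"
```

A real filter would read the JSON AST that pandoc emits, apply a function like this, and hand the modified tree back to pandoc for rendering.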