For the Love of Data

By For the Love of Data

About this podcast   English    United States

For the Love of Data is a monthly podcast devoted to all things data, from industry news and new products to cool data visualizations. Host Robert Furr and others hold discussions, interviews, reviews, and arguments to determine where the information technology industry is heading, with an emphasis on Business Intelligence (BI), Information Management (IM), and data analytics. Data science, analytics, strategy, and governance are just a few of the topics on the table. SQL, NoSQL, Tableau, R, Oracle, MySQL, SQL Server... these are just a few of the many tools we will noodle on during each episode.
March 30, 2018
Introductory Product Models:
- Only partner implementations (BluePrism)
- Limited features (WorkFusion)
- Customer revenue limited (UI Path)
- Single license (Softomotive)
Music: Deep Sky Blue by Graphiqs Groove via FreeMusicArchive.org
Sources:
https://irpaai.com/definition-and-benefits/
https://www.edgeverve.com/wp-content/uploads/2017/02/forrester-wave-robotic-process-automation.pdf
https://www.gartner.com/doc/reprints?id=1-3U26FK2&ct=170222&st=sb
http://www.uipath.com/hubfs/News_photos/Forrester_Wave_RPA_Report.png?t=1522186102828
http://images.abbyy.com/India/market_guide_for_robotic_pro_319864%20(002).pdf
https://www.uipath.com/community
https://www.workfusion.com/rpaexpress
https://idm.net.au/article/0011800-which-rpa-software-should-i-use
Feb. 28, 2018
Thank you to my friend and fellow Capco cohort, Dargah Fitzpatrick, for joining me on this episode of FTLOD, where we cut through the hype of AI to understand some of the key challenges and opportunities facing consumers and businesses alike when working with or alongside AI. Given that we’re talking about AI, I also have a twist for today’s interview: transcription! Today’s episode is transcribed here using machine learning from webASR, a free service provided through the University of Sheffield’s Machine Intelligence for Natural Interfaces (MINI). Note: The transcription is wonderful as a starting point and for a free service, but it does diverge from the actual conversation fairly significantly at times. Please listen to the episode as you read along.
Topics:
- Definition of AI and the singularity – should we be concerned?
- What’s going on in the AI space?
- Typical use cases in industry
- RPA vs. AI and different use cases
- Recommendation systems
- Challenges in profiling users or customers
- Ethical challenges and consequences of bad AI or black box AI
- AI is like fire: it can be highly useful, but it can also be a weapon and burn you.
- Perceptions of AI that are overhyped
- Not every product or service needs AI to be good
- At what point does intelligence begin?
- The fourth industrial revolution and its impact on society
- How to responsibly introduce life-altering AI
- Will AI supplement our lives and give us better quality of life, or will it make us do more, faster, stronger?
- Advancements in how AI plays the game, Dover
Some of the items we discuss are available in the following places:
https://blog.1871.com/the-1871-fintech-forum-a-discussion-around-the-reality-of-todays-automation-practices
https://blog.1871.com/1871-fintech-forum-future-of-data-and-analytics
https://samharris.org/podcasts/116-ai-racing-toward-brink/
http://fortune.com/2018/02/20/nasdaq-delist-long-blockchain-bitcoin-iced-tea/
https://en.wikipedia.org/wiki/Fourth_Industrial_Revolution
Music: Deep Sky Blue by Graphiqs Groove via FreeMusicArchive.org
Jan. 31, 2018
Background:
- Google used learned indexes (machine learning models) to access data and compared these to B-Tree, Hash, and Bloom Filter indices.
- They trained a model using multiple stages, where the earlier stages could approximate a location and later stages would work with a subset to improve accuracy.
- Each stage could choose a different model to advance the search further. FYI, the staged-model diagram looks like a decision tree, but it is not. Each stage/model could have different distributions and could repeat the model used above or below.
- They achieved access time and space savings across the board, even without using GPUs or TPUs (Tensor Processing Units).
- “Retraining the model” – the tests were performed on a static data set, so no retraining or index maintenance was required.
Observations / Questions:
- Used TensorFlow with Python as the front end; apparently there is a lot of initial overhead with this as a test stack.
- B-Tree indexes to some extent are a model, especially if they don’t store every key and instead store the first key in a page.
- The paper made some rudimentary assumptions, such as using a random hash function.
- What if the data is not static? How long would it take to retrain the model vs. maintain an index?
- What if data profiling caused you to index certain attributes and not others?
- What are the best practices with this newer approach?
- The power of being able to use different models at different stages is intriguing.
- You could also potentially maintain traditional indexes as a backup / failsafe, which would bound worst-case performance to that of a B-Tree.
- Load times – The folks from Google commented that they could retrain a simple model on a 200M data set in “just [a] few seconds if implemented in C++”
- Recursive question: do you need an optimizer to optimize the optimization path?
- Room for improvement: GPUs/TPUs; incorporating common queries into the model to know what questions people are asking
Music: Deep Sky Blue by Graphiqs Groove via FreeMusicArchive.org
Sources:
http://learningsys.org/nips17/assets/papers/paper_22.pdf
http://blog.ezyang.com/2017/12/a-machine-learning-approach-to-database-indexes-alex-beutel/
https://towardsdatascience.com/what-if-i-told-you-database-indexes-could-be-learned-6cf8f59bff94
https://www.arxiv-vanity.com/papers/1712.01208/
https://www.oreilly.com/ideas/how-machine-learning-will-accelerate-data-management-systems
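To make the staged idea from the learned-index notes above concrete, here is a minimal sketch in Python. It is not the code from the Google paper: the class names (LinearModel, TwoStageLearnedIndex), the choice of plain least-squares lines at both stages, and the num_leaves parameter are all assumptions made for illustration. A first-stage model approximates a key's position and picks a second-stage model; the second-stage model refines the guess, and a bounded binary search corrects whatever error remains.

```python
import bisect


class LinearModel:
    """Least-squares line mapping a key to an approximate position."""

    def __init__(self, keys, positions):
        n = len(keys)
        if n == 0:
            self.slope, self.intercept = 0.0, 0.0
            return
        mean_k = sum(keys) / n
        mean_p = sum(positions) / n
        var = sum((k - mean_k) ** 2 for k in keys)
        cov = sum((k - mean_k) * (p - mean_p) for k, p in zip(keys, positions))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k

    def predict(self, key):
        return self.slope * key + self.intercept


class TwoStageLearnedIndex:
    """Hypothetical two-stage index over a sorted list of unique numeric keys."""

    def __init__(self, sorted_keys, num_leaves=4):
        self.keys = sorted_keys
        self.num_leaves = num_leaves
        positions = list(range(len(sorted_keys)))
        # Stage 1: one model over all keys, used only to choose a leaf model.
        self.root = LinearModel(sorted_keys, positions)
        buckets = [[] for _ in range(num_leaves)]
        for k, p in zip(sorted_keys, positions):
            buckets[self._leaf_id(k)].append((k, p))
        # Stage 2: one model per bucket, plus its worst observed error.
        self.leaves, self.errors = [], []
        for bucket in buckets:
            model = LinearModel([k for k, _ in bucket], [p for _, p in bucket])
            err = max((abs(p - round(model.predict(k))) for k, p in bucket), default=0)
            self.leaves.append(model)
            self.errors.append(err)

    def _leaf_id(self, key):
        frac = self.root.predict(key) / max(len(self.keys), 1)
        return min(self.num_leaves - 1, max(0, int(frac * self.num_leaves)))

    def lookup(self, key):
        """Return the position of key, or None if it is not present."""
        leaf = self._leaf_id(key)
        guess = round(self.leaves[leaf].predict(key))
        err = self.errors[leaf]
        n = len(self.keys)
        hi = max(0, min(n, guess + err + 1))
        lo = min(max(0, guess - err), hi)
        # The recorded error bound guarantees a stored key lies in [lo, hi).
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None
```

Because each leaf records its worst-case error at build time, lookups of stored keys are exact on a static data set. The sketch has no insert path, which is the same retraining question the notes raise for data that changes.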
Dec. 30, 2017
This episode reflects on some of the hottest topics from 2017 and the impact their data has on our lives this year and into 2018.
Cryptocurrency
Many of these data points come from here.
- Since the year began, the aggregate market cap of all cryptocurrencies combined has increased by more than 3,200% as of Dec. 18
- Bitcoin went through the roof, hitting an all-time high of 1 BTC = $19,891 on 12/17/2017.
- BTC makes up 54% of the aggregate $589 billion market cap of all cryptocurrencies
- The graphics-card hardware needs of miners have been a big reason why NVIDIA and Advanced Micro Devices have seen a double-digit percentage surge in sales recently
- Back on Dec. 10, CBOE Global Markets (NASDAQ:CBOE) became the first to introduce bitcoin futures trading, with CME Group (NASDAQ:CME) following a week later
- 612 new cryptocurrencies began trading in 2017
Top 10 cryptocurrencies in 2017 as of 12/29 according to BitInfoCharts.com (pretty similar list on AtoZForex.com):
Cryptocurrency | Price in USD | Price in BTC | First Trade | Exchange volume 24h
BTC Bitcoin | $15,030.33; +9.79% ($1,340) in 12h; +9.56% ($1,312) in 7d | 1 BTC; +0% in 12 hours; +0% in 7 days | 2010-07-17 | 100,317 BTC; 100,316.59 BTC; 1,250,728,823.58 USD
XRP Ripple | $1.4; +11.79% ($0.15) in 12h; +28.92% ($0.31) in 7d | 0.000093 BTC; +1.82% in 12 hours; +17.67% in 7 days | 2014-08-14 | 462,239,606 XRP; 36,699.78 BTC; 551,610,001.98 USD
ETH Ethereum | $750.82; +9.3% ($63.9) in 12h; +12.07% ($80.9) in 7d | 0.05 BTC; -0.45% in 12 hours; +2.29% in 7 days | 2014-09-30 | 784,632 ETH; 34,510.75 BTC; 518,708,138.27 USD
BCH Bitcoin Cash | $2,571.87; +8.04% ($191) in 12h; +6.17% ($149) in 7d | 0.171 BTC; -1.59% in 12 hours; -3.1% in 7 days | 2017-08-01 | 209,597 BCH; 33,824.21 BTC; 508,389,215 USD
LTC Litecoin | $255.88; +11.54% ($26.5) in 12h; +0.89% ($2.26) in 7d | 0.017 BTC; +1.59% in 12 hours; -7.91% in 7 days | 2012-07-13 | 1,156,615 LTC; 18,070.26 BTC; 271,602,057.82 USD
IOT IOTA | $3.87; +11.04% ($0.38) in 12h; +2.12% ($0.08) in 7d | 0.00026 BTC; +1.14% in 12 hours; -6.8% in 7 days | 2017-08-30 | 37,838,946 IOT; 9,288.14 BTC; 139,603,822.48 USD
XMR Monero | $366.67; +7.1% ($24.3) in 12h; +9.42% ($31.6) in 7d | 0.024 BTC; -2.46% in 12 hours; -0.13% in 7 days | 2014-06-04 | 275,568 XMR; 6,354.54 BTC; 95,510,916.21 USD
DASH Dash | $1,126.09; +10.81% ($110) in 12h; +2.5% ($27.5) in 7d | 0.075 BTC; +0.93% in 12 hours; -6.45% in 7 days | 2014-02-20 | 85,797 DASH; 6,073.26 BTC; 91,283,097.97 USD
XVG VERGE | $0.168; +41.89% ($0.05) in 12h; +60.02% ($0.06) in 7d | 0.000011 BTC; +29.23% in 12 hours; +46.05% in 7 days | 2016-02-18 | 606,321,139 XVG; 5,940.19 BTC; 89,283,020.8 USD
ICX ICON | $5.73; +4.78% ($0.26) in 12h; +183.99% ($3.72) in 7d | 0.00038 BTC; -4.56% in 12 hours; +159.21% in 7 days | 2017-11-11 | 13,061,177 ICX; 5,072.57 BTC; 76,242,347.41 USD
Data Breaches
- Equifax – 9/7/2017 – 143mm US consumers affected. Stock plunged nearly $4bn in the aftermath. https://www.equifaxsecurity2017.com/
- RNC Voter List – nearly every registered voter, ~200mm Americans
- Yahoo’s 2013 breach revelation – affected accounts went from 1bn to 3bn
- Uber – 57mm user accounts and drivers, paid to keep it under wraps
- 560mm Passwords – a massive list of 560mm credentials compiled into one database of breaches from at least 10 services
You can check if your account is part of a compromise at have i been pwned or SpyCloud.
World Affairs
The World Bank has a fascinating article with 12 charts covering food assistance, climate change, education, nutrition, elections, energy and a tribute to Hans Rosling, who made us see the world in new ways with breathtaking visualizations.
Other Data Tidbits
- Most popular Instagram post: Beyonce – https://www.instagram.com/p/BP-rXUGBPJa/
- Most retweeted Twitter post: Carter’s quest for Wendy’s Chicken Nuggets – https://twitter.com/carterjwm/status/849813577770778624/photo/1
- Oracle bought API management firm Apiary. Be on the lookout for how that evolves for the tool and for Oracle.
- RPA saw continued growth and implementations. Expect more in 2018.
- Kubernetes is becoming the de facto standard for container management and was upgraded to Adopt by the ThoughtWorks Technology Radar. Expect it to continue to gain steam and start influencing data solutions more in 2018.
Music: Auld Lang Syne by Fresh Nelly, from Free Music Archive.
Sources:
https://www.coindesk.com/price/
https://www.investing.com/currencies/btc-usd-historical-data
https://bitinfocharts.com/new-cryptocurrencies-2017.html
https://atozforex.com/news/top-10-cryptocurrency-2017/
https://www.fool.com/investing/2017/12/19/16-cryptocurrency-facts-you-should-know.aspx
http://cryptocurrencyfacts.com/
https://gizmodo.com/the-great-data-breach-disasters-of-2017-1821582178
https://www.equifaxsecurity2017.com/
http://clark.com/personal-finance-credit/equifax-data-breach-a-look-back-at-our-biggest-story-of-2017/
http://beta.latimes.com/business/hiltzik/la-fi-hiltzik-equifax-breach-20170908-story.html
https://haveibeenpwned.com/
https://www.instagram.com/p/BP-rXUGBPJa/
https://www.usnews.com/news/national-news/articles/2017-12-12/twitters-top-10-most-retweeted-tweets-of-2017
http://www.worldbank.org/en/news/feature/2017/12/15/year-in-review-2017-in-12-charts
https://www.youtube.com/watch?v=YpKbO6O3O3M
https://www.informationweek.com/strategic-cio/digital-business/2017-year-in-review---exponential-automation/a/d-id/1330648?
https://www.thoughtworks.com/radar/platforms/kubernetes
Nov. 30, 2017
Zip file of all the sample data, Maestro flows, and Tableau workbook I used to get a first impression: E022_maestro_demo_files.
Screenshots: Sample Flow from Tableau; Field Selection; Data Profiling; Filters; Join Clause; Refresh / Run Flow; File Output Options
Pros:
- Has the clean, intuitive feel of Tableau. I did my hands-on test with no training or previous exposure
- Lots of features for a first release – joins, unions, type conversion, calculated fields, data connectors, etc.
- Easy to click into any part of your flow and see data
- Ability to edit inline – much like tweaking an Excel pivot table
- Data profiling is a nice visual cue to begin working with data
- Ability to sort, filter, rename, add calculated fields anywhere along the way
- Great for quick and dirty data prep that you know is heading into Tableau for ad-hoc analysis
Cons:
- Ability to sort, filter, rename, add calculated fields anywhere along the way – this can get messy for others to come behind you to maintain or see what is happening
- Reconciliation issues between reports will now be complicated by similar flows doing slightly different things
- You have to remove header fields from Excel if you want Maestro to latch onto and display field names from table. By default, it looks at first row and gives generic names if column headings aren’t there (i.e., F1, F2, …)
- Can only have one flow open at any time
- Performance seems a tiny bit slow on my example with ~13,000 rows. Curious to see how it will perform against larger data sets, RDBMS, and big data connectors
- Only outputs to TDE or Hyper formats currently. No ability to save as CSV, XLSX, PDF, or write back to a data store
- Unable to source data from a TDE or Tableau Workbook
- No reuse of common transformations or logic across different flows
- NO community generated content yet – since it is very new, you can’t Google for answers or YouTube videos. Established, mature ETL and data prep tools will continue to have a leg up on this front for a while.
Music: Deep Sky Blue by Graphiqs Groove
Sources:
https://www.tableau.com/project-maestro
https://prerelease.tableau.com/
https://www.eia.gov/electricity/data/eia923/
https://www2.census.gov/programs-surveys/popest/tables/2010-2016/state/totals/nst-est2016-01.xlsx
Oct. 31, 2017
Just in time for Halloween this year, we take a look at the way people will spend their money on candy and other goods during this spooky time.
Spending
People in the US are expected to spend $9.1 billion on Halloween this year, according to a study by the National Retail Federation. Several predictions about this year’s Halloween season include:
- U.S. consumers are projected to drop $82.93 on average, up almost 12 percent from $74.34 last year.
- More than 171 million consumers are expected to take part in Halloween festivities.
- Adults ages 18-34 are projected to spend on average $42.39, compared with $31.03 for all adults.
- According to the survey, consumers plan to spend: $3.4 billion on costumes (purchased by 69 percent of Halloween shoppers), $2.7 billion on candy (95 percent), another $2.7 billion on decorations (72 percent) and $410 million on greeting cards (37 percent).
- Among Halloween celebrants: 71 percent plan to hand out candy, 49 percent will decorate their home or yard, 48 percent will wear costumes, 46 percent will carve a pumpkin, 35 percent will throw or attend a party, 31 percent will take their children trick-or-treating, 23 percent will visit a haunted house and 16 percent will dress pets in costumes.
Top Costumes
More than 3.7 million children plan to dress as their favorite action character or superhero, 2.9 million as Batman characters and another 2.9 million as their favorite princess while 2.2 million will dress as a cat, dog, monkey or other animal. Proving that Halloween isn’t just for kids, a record number of adults (48 percent) plan to dress in costume this year. More than 5.8 million adults plan to dress like a witch, 3.2 million as their favorite Batman character, 3 million as an animal (cat, dog, cow, etc.), and 2.8 million as a pirate. Pets won’t be left behind when it comes to dressing up for Halloween. Ten percent of pet lovers will dress their animal in a pumpkin costume, while 7 percent will dress their cat or dog as a hot dog and 4 percent as a dog, lion or pirate.
Candy
CandyStore.com released data from 10 years of bulk candy online sales that show favorite candies by state.
STATE | TOP CANDY | POUNDS | 2ND PLACE | POUNDS | 3RD PLACE | POUNDS
AL | Candy Corn | 55274 | Hershey’s Mini Bars | 54369 | Tootsie Pops | 42533
AK | Twix | 4678 | Blow Pops | 4578 | Kit Kat | 3892
AZ | Snickers | 904633 | Hershey Kisses | 817463 | Hot Tamales | 527843
AR | Jolly Ranchers | 225990 | Butterfinger | 215897 | Hot Tamales | 89027
CA | M&M’s | 1548990 | Salt Water Taffy | 1345782 | Skittles | 1034527
CO | Milky Way | 5620 | Twix | 5478 | Hershey Kisses | 4087
CT | Almond Joy | 2457 | Milky Way | 1985 | M&M’s | 1023
DE | Life Savers | 20748 | Skittles | 18072 | Candy Corn | 10217
FL | Skittles | 630938 | Snickers | 587385 | Reese’s Cups | 224637
GA | Swedish Fish | 130647 | Hershey Kisses | 109672 | Jolly Ranchers | 55049
HI | Skittles | 267872 | Hershey Kisses | 264728 | Milky Way | 139874
ID | Candy Corn | 85903 | Starburst | 60826 | Reese’s Cups | 39847
IL | Sour Patch Kids | 155782 | Kit Kat | 151786 | Reese’s Cups | 95627
IN | Hot Tamales | 95092 | Starburst | 78920 | Snickers | 34589
IA | Reese’s Cups | 58974 | M&M’s | 53982 | Butterfinger | 25782
KS | Reese’s Cups | 231476 | M&M’s | 230082 | Dubble Bubble Gum | 159092
KY | Tootsie Pops | 67829 | 3 Musketeers | 60273 | Reese’s Cups | 30865
LA | Lemonheads | 102833 | Reese’s Cups | 89738 | Jolly Ranchers | 45092
ME | Sour Patch Kids | 58290 | M&M’s | 45938 | Starburst | 16782
MD | Milky Way | 38782 | Reese’s Cups | 30748 | Blow Pops | 12093
MA | Sour Patch Kids | 75638 | Butterfinger | 73892 | Salt Water Taffy | 45982
MI | Candy Corn | 146782 | Skittles | 135982 | Starburst | 87740
MN | Tootsie Pops | 195783 | Skittles | 194672 | Almond Joy | 98726
MS | 3 Musketeers | 109783 | Snickers | 103993 | Butterfinger | 57829
MO | Milky Way | 42739 | Dubble Bubble Gum | 34751 | Butterfinger | 24780
MT | Dubble Bubble Gum | 24675 | M&M’s | 14673 | Twix | 13784
NE | Sour Patch Kids | 106728 | Salt Water Taffy | 78624 | M&M’s | 23674
NV | Hershey Kisses | 322884 | Candy Corn | 203746 | Skittles | 167837
NH | Snickers | 63876 | Starburst | 62468 | Salt Water Taffy | 25987
NJ | Skittles | 159324 | Tootsie Pops | 157893 | M&M’s | 110673
NM | Candy Corn | 83562 | Milky Way | 65682 | Jolly Ranchers | 45721
NY | Sour Patch Kids | 200008 | Candy Corn | 101292 | Reese’s Cups | 56776
NC | M&M’s | 96110 | Reese’s Cups | 95763 | Candy Corn | 62308
ND | Hot Tamales | 65782 | Jolly Ranchers | 61829 | Candy Corn | 51827
OH | Blow Pops | 150324 | M&M’s | 146782 | Starburst | 105752
OK | Snickers | 20938 | Dubble Bubble Gum | 10283 | Butterfinger | 8892
OR | Reese’s Cups | 90826 | M&M’s | 67626 | Tootsie Pops | 42774
PA | M&M’s | 290762 | Skittles | 281847 | Hershey’s Mini Bars | 150372
RI | Candy Corn | 17862 | M&M’s | 13894 | Twix | 9003
SC | Candy Corn | 114783 | Skittles | 98782 | Hot Tamales | 41892
SD | Starburst | 24783 | Jolly Ranchers | 22983 | Candy Corn | 7827
TN | Tootsie Pops | 59837 | Salt Water Taffy | 34859 | Skittles | 20938
TX | Starburst | 1952361 | Reese’s Cups | 1927663 | Almond Joy | 837525
UT | Jolly Ranchers | 475221 | Reese’s Cups | 29823 | Tootsie Pops | 198564
VT | Milky Way | 29837 | M&M’s | 27811 | Skittles | 17662
VA | Snickers | 26783 | Hot Tamales | 26178 | Candy Corn | 18726
WA | Tootsie Pops | 223850 | Salt Water Taffy | 210981 | Hershey Kisses | 78662
DC | M&M’s | 26092 | Tootsie Pops | 21364 | Blow Pops | 14763
WV | Blow Pops | 43776 | Hershey’s Mini Bars | 23554 | Milky Way | 18911
WI | Starburst | 116788 | Butterfinger | 115982 | Jolly Ranchers | 42998
WY | Reese’s Cups | 32889 | Salt Water Taffy | 26555 | Skittles | 20812
FiveThirtyEight took a different approach by analyzing data from 269,000 head-to-head matchups between candies. Their findings: Reese’s took 4 of the top 10 spots! They boiled it down into the following elements:
Music: In This Creepy, Sleepy Backward Town by Squire Tuck via Free Music Archive
Sources:
https://nrf.com/media/press-releases/halloween-spending-reach-record-91-billion
https://www.candystore.com/blog/facts-trivia/halloween-candy-map-popular/
https://www.candyindustry.com/blogs/14-candy-industry-blog/post/87484-halloween-scary-good-for-candy-sales
http://fivethirtyeight.com/features/the-ultimate-halloween-candy-power-ranking/
http://freemusicarchive.org/music/Squire_Tuck/Happy_Halloween_1583/In_This_Creepy_Sleepy_Backward_Town_1_-_29102016_1146
Sept. 27, 2017
If you’re in crisis and in the US, text 741741 to talk with a counselor now.
In this episode we speak with the people behind Crisis Text Line and Crisis Trends, two services that use data to make a difference for those going through a crisis or looking for someone with whom to talk.
Overview
- Texters contact the hotline by texting the shortcode 741741.
- Volunteers are logged onto “the platform”, which is on CTL’s internal site, to receive these messages and access counselor tools.
- Their data is collected in real time and is updated in close to real time: https://crisistrends.org/
- This is the TED talk where the founder introduced her idea for the organization: https://www.ted.com/talks/nancy_lublin_texting_that_saves_lives
- This is the TED talk 3 years later where the founder shared an update on CTL’s success and shared information about how they use data intelligently on their platform: https://www.ted.com/talks/nancy_lublin_the_heartbreaking_text_that_inspired_a_crisis_help_line
Key Stats
- Over 1 million messages transmitted per month
- 75% of texters are under 25
- 10% under age 13
- 65% say they have shared something with Crisis Text Line that they haven’t shared with anyone else
- Usually at least one active rescue per day
- Take people based on severity and have the ability to initiate an active rescue (via 911)
- Words like ibuprofen, aspirin, tylenol are more indicative of active rescue need than the words die, overdose, suicide
- emoji is 4x more of an indicator
- Roots of CTL go back to 1906 when Save-A-Life League started via newspaper ads
- The Samaritans was the first phone suicide hotline and started in November 1953
- Founded by Nancy Lublin, who is also the CEO of DoSomething.org, in 2011
Introductions – background, how they got their start, how they got involved in CrisisTextLine
- Staci – volunteer
- Scotty – Data Scientist
History of Crisis Text Line and high-level structure (where they operate, # of locations, # of employees / volunteers)
Staci’s experience
- What was training like?
- Where does she take sessions and how often?
- How does she feel after a session?
- Her experience as a counselor and thoughts on the impact, data, etc.
What ways they collect data
- #s of texters
- UI platform for counselors
- Types of data they collect
- Types of technologies used to collect/manage it – both publicly, behind the scenes, for presentations, etc.
What ways they use data
- CrisisTrends.org site
- Anonymity, opt-in/opt-out options and how frequently each occurs
- Key stats they feel are most important/surprising/alarming, etc.
- How has data made an impact to those in need?
- How has data made an impact to counselors?
- How has data made an impact to the organization?
- How has data made an impact to the crisis advocacy sector as a whole?
What ways other people can use their data
- Do they encourage visitors to explore to find their own insights?
- Will data be available by zip code at some point?
Data Science
- What tools and techniques do they see being most important in the near term?
- What do they see as becoming less important in the near term?
- What is something they could have told their earlier selves that would have made their path to this point easier?
Organization Info
- How someone can get involved
- What they need most
- What is in store for the future? New technologies, platforms for contact, etc.
- How someone can contact them
Music: Deep Sky Blue by Graphiqs Groove
Sources:
https://youtu.be/KOtFDsC8JC0 – TED talk about origin
https://www.crisistextline.org/
https://crisistrends.org/
http://www.newyorker.com/magazine/2015/02/09/r-u
Aug. 20, 2017
Join me as I chat with my colleague and Cognos guru John Frazier about the latest release of Cognos, leading up to the anticipated release of the next version, 11.0.7, near the end of Q3. The latest version of Cognos (11.0.6) debuted on March 21, 2017. You can sign up for a perpetually free trial (like Tableau Online) here. Version 11 was originally released in December 2015 and was mainly a UI redesign on top of Cognos 10 features. Analysis and Query Studios will eventually be deprecated.
New Features in 11 vs. 10
- New UI – responsive web design on UI, but not on reports
- Better self-service capabilities and collaboration for teams
- Upload data files – upload delimited text or Excel files to be stored in a columnar format (Parquet) on the file system (not in memory or in the DB). These are immediately usable in dashboards and don’t require entry into FM.
- Data modules (intent based modeling based on Watson) similar to FM packages
- Note: Dashboards only use uploaded files and data modules
- Available on cloud
- Mobile and desktop from a single report
- Active reports as prompts
- Free cloud trial
- Admin console is unchanged
New Features in 11.0.6
- Mapping enhancements: multiple admin boundaries, add’l postal code support
- Dashboarding enhancements: direct access to OLAP packages (Framework packages accessible since 11.0.5); widgets using data from the same source are connected by default; new grid widget; color gradient by measure; date filters can include blanks
- Portal enhancements: share/embed through overflow menu; folder customizations can be done directly through the UI more easily (without uploading JSON configs); create shortcuts and report views
- Storytelling enhancements: new guided journey templates; new animations (side fade, slide, scale, zoom, pivot); better pins (smart named, better search and filter); timelines – smart names; change scene template while working on your story/dashboard
- Reporting enhancements: better lineage support for FM packages; business glossary (w/IBM InfoSphere Information Governance Catalog integration); better freeze list column heading control; better query support when editing data modules; report templates – can save for your team or save as style reference reports
- Support for Planning Analytics: dashboard support for TM1 / Planning cubes; REST connectivity to planning analytics; support for attribute hierarchies; support for localized Planning Analytics cubes
- Data server enhancements: support for Google BigQuery and Google Cloud SQL via the BigQuery JDBC and MySQL JDBC drivers, respectively; JDBC URL for Data Server Connections; test connection feedback (this is not just in admin console now)
John’s Likes/Dislikes with v11:
- For those who are “used” to ReportStudio there is a pretty “steep” learning curve to locate where particular tools or components have been moved. To be fair, ReportStudio had some counter-intuitive placements for some of these same tools (e.g. Hierarchy of design elements, etc.) that caused major headaches for new report designers.
- Overall the new interface is more “intuitive” and the novice report developers I’ve worked with have picked it up remarkably quickly.
- There are some changes that are really “nice” – like being able to see which Lists/Graphs use a particular query right from the query tree without having to “search” for where it is used on the “right click” menu.
Music: Deep Sky Blue by Graphiqs Groove
Sources:
https://www.ibm.com/analytics/us/en/technology/products/cognos-analytics/
https://www.ibm.com/communities/analytics/cognos-analytics-blog/the-latest-release-of-cognos-analytics-is-here/
http://newintelligence.ca/top-12-reasons-to-upgrade-to-cognos-analytics-a-k-a-cognos-11/
https://www.ibm.com/support/knowledgecenter/SSEP7J_11.0.0/com.ibm.swg.ba.cognos.ca_new.doc/c_ca_nf_deprecated.html
https://www.ibm.com/support/knowledgecenter/en/SSEP7J_11.0.0/com.ibm.swg.ba.cognos.ca_new.doc/c_ca_nf_11_0_x.html
https://www.slideshare.net/senturus/cognos-analytics-version-11-questions-answered
July 30, 2017
What if you could store your data in the cloud, encrypted, for a fraction of the cost of Amazon S3, Google, or Azure? With Sia, a decentralized file storage solution that leverages blockchain, you can. Learn more about how it works in this episode.
Blockchain Overview
A blockchain is a permissionless distributed database that maintains a continuously growing list of transactional data records. The system’s design means it is hardened against tampering and revision, even by operators of the nodes that store data. The initial and most widely known application of blockchain technology is the public ledger of transactions for bitcoin, but its structure has been found to be highly effective for other financial vehicles.
[Illustration by Matthäus Wander (Wikimedia)]
- Timestamp: The time when the block was found.
- Reference to Parent (Prev_Hash): This is a hash of the previous block header which ties each block to its parent, and therefore by induction to all previous blocks. This chain of references is the eponymic concept for the blockchain.
- Merkle Root (Tx_Root): The Merkle Root is a reduced representation of the set of transactions that is confirmed with this block. The transactions themselves are provided independently, forming the body of the block. There must be at least one transaction: the Coinbase. The Coinbase is a special transaction that may create new bitcoins and collects the transaction fees. Other transactions are optional.
- Target: The target corresponds to the difficulty of finding a new block. It is updated every 2016 blocks when the difficulty reset occurs.
- The block’s own hash: All of the above header items (i.e. all except the transaction data) get hashed into the block hash, which for one is proof that the other parts of the header have not been changed, and then is used as a reference by the succeeding block.
Sia Overview
- Decentralized network that places encrypted pieces of your data on dozens of nodes
- Aims to be fastest, cheapest, most secure storage solution and compete with AWS, GCP, Azure
- Users pay in Siacoins, a cryptocurrency like Bitcoin
- Must go USD -> Bitcoin -> Siacoin -> Wallet -> File Upload
- Open source
- Started by David Vorick and Luke Champine through a VC backed Boston-based company called Nebulous Inc
- Origins in the HackMIT 2013 conference
- Uses ASICs (application specific integrated circuits) for mining
- These are purpose built integrated circuits, not general multi-use devices
- Evolution from CPU -> GPU -> ASIC
- Faster and less vulnerable to attacks than GPUs. Why? See here.
- Created a company to make ASICs called obelisk. ~$2,500 per machine
- Current price is about 124 Siacoin to $1USD
Pros
- Decentralized, peer-to-peer
- Encrypted and immutable
- Hosts can earn money by renting free disk space to renters
- Must maintain 95% uptime to preserve collateral
Possible Issues
- Renters uploading illegal content to hosts. However, renters would have to pay for the bandwidth leechers use to download files
- Slow at this point
- Low number of users
Music: Deep Sky Blue by Graphiqs Groove
Sources:
Website: http://sia.tech/
Wiki: https://siawiki.tech/
Github: https://github.com/NebulousLabs
Slack: https://slackin.sia.tech/
Forum: https://forum.sia.tech/
Sia vs. USD: http://siapulse.com/page/market
How to get started: https://blog.sia.tech/getting-started-with-private-decentralized-cloud-storage-c9565dc8c854
Conversion: https://www.coingecko.com/en/price_calculator/siacoin/usd
https://bitcoin.stackexchange.com/questions/12427/can-someone-explain-how-the-bitcoin-blockchain-works
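The block header fields listed in the blockchain overview above (timestamp, Prev_Hash, Tx_Root, target, and the block's own hash) can be sketched in a few lines of Python. This is a simplified illustration and not Bitcoin's or Sia's actual format: the JSON serialization, the function names (sha256d, merkle_root, make_block, verify_chain), and the omission of a nonce and of proof-of-work are all assumptions made to keep the example short.

```python
import hashlib
import json


def sha256d(data: bytes) -> str:
    """Double SHA-256, returned as a hex string."""
    return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()


def merkle_root(transactions):
    """Reduce a non-empty list of transaction strings to a single hash."""
    if not transactions:
        raise ValueError("a block needs at least one transaction (the coinbase)")
    level = [sha256d(tx.encode()) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [sha256d((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
    return level[0]


def hash_header(header):
    # Assumption: canonical JSON stands in for a real binary header layout.
    return sha256d(json.dumps(header, sort_keys=True).encode())


def make_block(prev_hash, transactions, target, timestamp):
    header = {
        "timestamp": timestamp,
        "prev_hash": prev_hash,               # reference to parent
        "tx_root": merkle_root(transactions),  # Merkle root of the body
        "target": target,
    }
    return {"header": header, "hash": hash_header(header),
            "transactions": list(transactions)}


def verify_chain(blocks):
    """Check header hashes, Merkle roots, and parent references."""
    for i, block in enumerate(blocks):
        header = block["header"]
        if hash_header(header) != block["hash"]:
            return False
        if merkle_root(block["transactions"]) != header["tx_root"]:
            return False
        if i > 0 and header["prev_hash"] != blocks[i - 1]["hash"]:
            return False
    return True
```

Changing any transaction changes the Merkle root, which no longer matches the header, and changing the header changes the block hash that the next block references. That dependency is what makes revision detectable.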
June 29, 2017
In this episode we cover the new features in Tableau 10.3. This version debuted on May 31st, and a 10.3.1 update was released on 6/21/17.
Data Driven Alerts
- Only on Tableau Server
- Receive an alert when a mark crosses a visual threshold
- Can use on any viz with a continuous numeric axis
- Can sign up yourself and others; then each person can self-administer
- Default check rate is 60 minutes or when an extract is refreshed. Can customize with these commands:
  tabadmin set dataAlerts.checkIntervalInMinutes
  tabadmin restart
Tableau Bridge – Limited Release
- Connect to live, on-premise data from Tableau Online
- Replaces the sync client – is basically the sync client + live query functionality. Client is installed and run behind your firewall and pushes data to Tableau Online.
- Live connections must be enabled by administrators.
- Limited to RDBMSs (MySQL, SQL Server, etc.)
- Oracle cloud hosted DBs must use Tableau Bridge
- Must run as a service to enable live connections
- Must embed credentials in Tableau Bridge if you want it to automatically update on a schedule
- Will restart every hour minimum. You can set this window with this command:
  tabonlinesyncclientcmd.exe SetDataSyncRestartInterval --restartInterval=<value in seconds>
- Best Practices (https://www.tableau.com/about/blog/2017/5/introducing-tableau-bridge-live-queries-premises-data-tableau-online-70767)
  - Split bridges into two machines: one for extract refreshes and another for live queries, unless usage is extremely low
  - Run the bridge continuously (ideally on a VM in a data center)
  - Tune dashboards and queries to leverage extracts for summarized data
Smart Table and Join Recommendations – Machine Learning will recommend tables and joins (even on non-similar names) based on previous usage metrics
PDF Connector
- Connect to PDFs, identify tables, and pull data out
- Less copying/pasting/massaging of data to get it ready for Tableau
- Somewhat limited at this time, but continuing to be developed
More Union support in more connectors
- DB2
- Hadoop
- Teradata
- Netezza
New connectors
- Amazon Athena
- MongoDB BI
- OneDrive
- ServiceNow
- Dropbox
- JSON – scan entire file, not just a sample
Automatic Query Caching – Tableau server can pre-cache queries in recent workbooks after an extract refresh to speed up performance on initial load.
Miscellaneous
- More options in Web Authoring (drills, formats, changing displays)
- Story points navigator – more streamlined
- Mobile – Android improvements, banner to Tableau Mobile, universal linking that allows you to click and open in Tableau Mobile
- Tooltip selections – highlight data from tooltip links
- Latest date filter
- Distribute evenly
- Maps – French, Netherlands, Australian, and New Zealand updates
- Apply table calc filters to totals
- Custom subscriptions – days/hours, etc.
- APIs – various REST updates (tags on sources and views, switch sites, get sites list, etc.)
Music: Deep Sky Blue by Graphiqs Groove
Sources:
https://www.tableau.com/new-features/10.3
https://www.tableau.com/about/blog/2017/4/save-time-data-driven-alerts-tableau-103-67888