
Teradata Amps Up Cloud And Consulting Offerings


Teradata is thinking outside of box sales with VMware, AWS and Azure deployment options and new solutions and consulting services.

At this week’s Teradata Partners Conference in Atlanta the company hit several important cloud milestones with its “Teradata Everywhere” and “Borderless Analytics” announcements. And in another sign that it’s evolving, Teradata also announced a range of analytic solutions supported by consulting services.

Teradata Everywhere is the ability to run the same database and workloads without alteration in multiple deployment environments. The choices include on-premises systems, VMware-based private-cloud instances, Teradata’s Managed Cloud services and Teradata Database on public clouds including Amazon Web Services and, by year end, Microsoft Azure.

(Video: Teradata Amps Up Cloud & Consulting, from Constellation Research on Vimeo.)

The newest options here are Teradata on VMware and parallel processing support on Amazon Web Services. The Teradata Database on AWS offering was introduced earlier this year, but it was initially limited to single-node deployment. Now you can exploit the power of massively parallel processing on up to 32 nodes, and Teradata says it will keep raising the node ceiling. MPP deployment on Microsoft Azure is set for the fourth quarter.

MyPOV on Teradata Everywhere. Teradata pre-announced most of these offerings last year, but it’s good to see it following through on both MPP (without that, what’s the point of using Teradata?) and Microsoft Azure. What was surprising was seeing an Amazon Web Services exec keynoting at Partners. That’s a good sign that Teradata is truly embracing the cloud, but I think the VMware option will be even more popular than the public cloud options, at least for existing customers. For now, VMware deployment is capped at eight virtual servers and 32 virtual nodes, but that’s big capacity and Teradata says server and node capacities will increase over time.

Software Supports ‘Borderless Analytics’

It’s all well and good to have multiple deployment options, but the key to hybrid success is flexible data-access, querying and systems management. The borderless concept is supported by the latest versions of Teradata QueryGrid, for accessing data across heterogeneous environments, and Teradata Unity, for automated workload orchestration across multiple Teradata systems.

QueryGrid already supported unified access and querying against Teradata, Teradata Aster, third-party relational databases, Hadoop, and analytic compute clusters. That support extends to cloud instances of these sources. QueryGrid brought together what were separate Teradata-to-Hadoop, Teradata-to-Oracle and Teradata-to-Aster connectors. A rewrite due out by year end will unify the underlying architecture so there will be single connectors for each target system instead of multiple connectors. The new version will deliver better and more consistent query performance, and, according to Teradata, better support for security, encryption and performance monitoring across all sources.
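To make the borderless idea concrete, here’s a minimal, hypothetical sketch of the kind of federated query QueryGrid enables, written with the teradatasql Python driver. The host, credentials, table names and the table@foreign_server reference are all placeholders of mine; the exact foreign-server syntax depends on how the QueryGrid connector is configured.

```python
# Hypothetical federated query: join a warehouse table with a Hadoop-resident
# table exposed through a QueryGrid foreign server. All names are placeholders.
import teradatasql

with teradatasql.connect(host="td.example.com", user="demo", password="secret") as con:
    with con.cursor() as cur:
        cur.execute("""
            SELECT c.customer_id, c.segment, SUM(s.amount) AS clickstream_spend
            FROM customers c
            JOIN weblog_sales@hadoop_prod s   -- assumed foreign-server reference
              ON s.customer_id = c.customer_id
            GROUP BY c.customer_id, c.segment
        """)
        for row in cur.fetchall():
            print(row)
```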

Teradata Unity software automates workload distribution to ensure high availability and query performance within service-level demands. The latest release extends that workload balancing capability across on-prem, private, managed and public cloud Teradata instances. A Unity upgrade expected in the first half of 2017 will capture data changes on one Teradata system and automatically copy them to other Teradata systems. This will support cloud-based backup and disaster recovery use cases, for example, by automatically keeping on-premises and cloud-based systems in sync.

MyPOV on Borderless Analytics. QueryGrid is very popular among Teradata customers. Most connections are bidirectional, and it can push down queries into source systems including Hadoop to reduce overall query times. Unity is an extra-cost option, but the automation and workload distribution capabilities are hugely helpful when running multiple Teradata systems. As companies tap cloud instances, they’ll use Unity to support bursting scenarios wherein they seamlessly shift spiky or low-priority workloads into the cloud to better handle peak workloads and meet service-level agreements.

Teradata As Solutions Provider

Teradata got its start by selling to the business, not IT. In his keynote at Partners, Teradata CEO Victor Lund, who was appointed this spring after Mike Koehler was ousted, admitted that the company lost its way in part because it lost sight of selling to business needs. About half of Teradata’s revenue is already tied to solutions and consulting, but that ratio may grow given the Partners announcement of new analytic solutions, methodologies and accelerators backed by Teradata consulting.

Customer Journey Analytic Solution: This offering blends Teradata’s Real-Time Interaction Manager, Customer Interaction Manager and Teradata Aster ensemble analytics to track end-to-end customer paths across channels (email, online, in-store, call center, etc.). It then delivers recommended next-best actions and offers based on historical as well as in-the-moment behaviors.

The Customer Journey Analytic Solution is differentiated from similar-sounding offerings in that it addresses in-store and call-center interactions as well as digital channels, says Teradata. And by incorporating real-time context, it avoids pushing offers to someone who just received that offer in a different channel, just made a purchase, or is trying to resolve a service problem and is in no mood to buy.
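The suppression logic described above is easy to picture in code. Here’s a purely hypothetical sketch, with rules and field names of my own invention rather than anything Teradata ships:

```python
# Hypothetical next-best-action filter: hold back an offer when in-the-moment
# context shows the customer just saw it, just bought, or has an open service case.
from datetime import datetime, timedelta, timezone

def should_suppress_offer(offer_id, context, now=None):
    now = now or datetime.now(timezone.utc)
    recently = now - timedelta(hours=24)

    saw_it_elsewhere = any(
        imp["offer_id"] == offer_id and imp["timestamp"] > recently
        for imp in context["recent_offer_impressions"]
    )
    just_purchased = any(p["timestamp"] > recently for p in context["recent_purchases"])
    open_service_case = context["open_service_cases"] > 0

    return saw_it_elsewhere or just_purchased or open_service_case
```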


RACE Services and Business Value Frameworks: RACE is a Rapid Analytic Consulting Engagement. The first step is aligning with the customer around Business Value Frameworks that provide starting points for high-value use cases. The prebuilt Frameworks define hundreds of analytic use cases, according to Teradata, covering domains including customer and marketing, supply chain, product, operations, and finance and risk. Example use cases include Customer Satisfaction Index and Communications Compliance.

Analytics of Things Accelerators: Based on proven engagements with large industrial companies, these accelerators combine professional services with prebuilt starting-point content including data models, data transformations, analytic models, data visualizations and KPIs. The idea is to speed and take risk out of IoT projects. The first four accelerators are: Condition-Based Maintenance (think predictive maintenance and parts ordering); Manufacturing Performance Optimization (think maximizing equipment uptime); Sensor Data Qualification Accelerator (to determine which sensor data to clean up, filter out and keep); and Visual Anomaly Prospect Accelerator (for detecting actionable patterns in data).
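For a sense of what a condition-based maintenance rule boils down to, here’s a small illustrative sketch; the window size and threshold are my own assumptions, not content from Teradata’s accelerators:

```python
# Hypothetical condition-based maintenance check: flag equipment for service when
# a sensor's recent average drifts well outside its historical baseline.
import statistics

def needs_maintenance(readings, baseline_mean, baseline_stdev, window=50, z_threshold=3.0):
    """Return True when the rolling mean deviates more than z_threshold sigmas from baseline."""
    recent = readings[-window:]
    if len(recent) < window or baseline_stdev == 0:
        return False  # not enough data (or a flat baseline) to judge
    z = abs(statistics.mean(recent) - baseline_mean) / baseline_stdev
    return z > z_threshold
```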

MyPOV on Teradata’s Solutions Focus. Teradata is understandably putting an even bigger emphasis on solutions and consulting given that revenue from on-premises systems is and will remain under pressure. Data warehouse optimization projects are accelerating that trend as companies shift workloads onto Hadoop or cloud options including Teradata’s own services. The shift is one reason Teradata in 2014 acquired Think Big Analytics, which specializes in Hadoop and open source services and consulting. In July the company upped the ante by acquiring London-based Big Data Partnership, a startup that provides big data solutions and training.

The open question for Teradata is whether it can grow the pipeline of solutions and consulting engagements even as the pace of on-premises deployments and upgrades declines. New hardware purchases have historically triggered such engagements, so Teradata will have to find new ways to get its foot in the door. Another competitive threat is systems integrators that have typically been Teradata partners but that are ramping up analytics practices of their own. Teradata’s enviable list of existing customers and new cloud engagements are obvious places to look for solution and consulting opportunities.

What was clear at the Partners Conference is that market forces and this year’s leadership change have sparked both new thinking and a back-to-basics focus on business value. That’s leading to innovation and a raft of new deployment and solution options aimed at fast, flexible delivery of value to the business.

Related Reading:
Teradata Disrupts Self With Cloud Push
SAP Reportedly Buying Altiscale to Power Big Data Services
Democratize the Data Lake: Make Big Data Accessible



MapR Ambition: Next-Generation Application Platform


MapR promises a more scalable, reliable, real-time-capable and converged alternative to Hadoop, NoSQL databases and Kafka combined. Are companies buying it?

MapR is frequently mentioned in the same breath with Hadoop vendors Cloudera and Hortonworks, but maybe it’s time to stop thinking of them as competitors. Indeed, over the last eighteen months, MapR has added ambitious NoSQL database and streaming capabilities to what the company now calls its MapR Converged Data Platform.

The differences between MapR and its erstwhile competitors were underscored at MapR’s first ever analyst day, December 13, at its headquarters in San Jose, CA. Executives not only contrasted MapR’s platform with Hadoop, they also detailed advantages versus the NoSQL databases Cassandra, HBase and MongoDB, and an open source staple of streaming applications, Apache Kafka. They bemoaned the “complexity” and “chaos” of multi-project open-source deployments, and MapR CEO Matt Mills, a 20-year Oracle veteran, proudly declared MapR to be “a commercial enterprise software company.”


MapR presents its Converged Data Platform as a more scalable, reliable and performant alternative to Hadoop, NoSQL databases and Kafka combined.

It’s not that MapR doesn’t exploit open source innovation. The MapR platform includes components of Hadoop and Spark as well as Drill and Myriad, the last two being projects incubated by MapR and contributed to open source. The platform also relies entirely on industry-standard and open source APIs (a choice the company asserts eliminates the possibility of lock-in), even when MapR has replaced the associated components.

MapR chose from its founding to replace the Hadoop Distributed File System (HDFS) with a POSIX/NFS standard file system, for example, yet developers can still use the HDFS API. The POSIX/NFS choice provided read/write capabilities (versus append-only HDFS), better performance, and a “volumes” data construct for higher scalability and easier data organization and governance.
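One practical way to see what that choice means: with the cluster file system mounted over NFS (the path below is a hypothetical mount point), ordinary file I/O, including in-place updates, works with standard tools, something an append-only file system doesn’t allow.

```python
# Hypothetical illustration: a random-access, in-place update on an NFS-mounted
# cluster path. On append-only HDFS you would have to rewrite the whole file.
path = "/mapr/cluster1/projects/demo/settings.bin"  # placeholder mount path

with open(path, "r+b") as f:   # read/write without truncating
    f.seek(1024)               # jump into the middle of the file
    f.write(b"\x01\x02\x03")   # overwrite three bytes in place
```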

The early POSIX/NFS choice is now paying dividends as MapR goes after database and streaming roles. The underpinning technology gives the MapR-DB database consistency, reliability and scalability advantages over HBase, Cassandra and MongoDB, says the company, yet developers can still use the HBase API. And given the breadth of capabilities across the platform (including MapR-DB), MapR cites scalability, data persistence, performance and global deployment advantages over Kafka and complex Lambda architectures (yet developers can use the Kafka API).

MapR hasn’t brought together all these capabilities just to check more boxes. Executives said they’re seeing more and more customers building out next-generation applications. The hallmark of such applications is compound requirements spanning the capabilities of file systems, search, databases and streaming systems. Another trait is the embedding of analytics directly into operational applications to support automated, data-driven actions without human intervention. MapR says its converged platform supports all of these demands with better speed, scale and reliability than you can cobble together with multiple open-source point systems.
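Here’s a minimal sketch of that embedded-analytics pattern using the standard Kafka consumer API (which MapR’s streaming component also exposes). The broker address, topic and scoring rule are my assumptions, not MapR code:

```python
# Hypothetical sketch: score events as they arrive and act automatically, rather
# than waiting for a human to read a report.
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "payments",                                   # placeholder topic
    bootstrap_servers="broker.example.com:9092",  # placeholder broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

def risk_score(txn):
    # Stand-in for a real model: large cross-border transactions score higher.
    return 0.9 if txn["amount"] > 10_000 and txn["cross_border"] else 0.1

for message in consumer:
    txn = message.value
    if risk_score(txn) > 0.8:
        print(f"Blocking transaction {txn['id']} pending review")  # automated action
```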

MapR shared plenty of examples of customers building out next-gen apps. A Paysafe executive was there to talk about how it detects potentially fraudulent payment transactions within milliseconds so it can stop them before they go through. Rubicon runs a real-time, high-scale online ad exchange that handles peak loads of 5 million queries per second with 300 real-time decisions per ad placement. National Oilwell Varco analyzes sensor data from its oil well drills in real time to optimize production output and support predictive maintenance. And Qualcomm monitors sensors in its semiconductor plants in real time to automate actions that improve manufacturing yields.

The typical MapR customer is experienced with big data deployments, and more than 40% are former Cloudera or Hortonworks customers, according to the company. Given MapR’s commercial approach and emphasis on sophisticated requirements, it’s not the right choice for a big data newbie or an open-source zealot. Partner Gustavo De Leon of Cognizant described would-be MapR customers as falling into the second of two classes of big data practitioners he’s seeing. First, there are the companies doing lots of big data proof-of-concept (POC) projects and not being terribly productive. Second, there are the companies that are more business focused and are concentrating on specific use cases.

De Leon’s implication was that MapR customers “want to know that they can take POCs into production and that the application will be enterprise ready and capable when they’re done.”

MyPOV on MapR Converged Data Platform

MapR’s foray into NoSQL and streaming opportunities is ambitious, but the vision to serve converging requirements and high-performance demands isn’t new to the company. It has been the company’s focus and direction for years. What was new at the analyst day was hearing the vision directly from top brass along with forward-looking statements about the roadmap, investment plans and a possible future initial public offering. What was somewhat surprising was hearing quite the degree of open-source bashing, though I am hearing growing impatience from big data practitioners about the complexity of deploying and managing dozens of separate open source projects.

It was a good first-time analyst event for MapR, but the company was a bit stingy with company measures and plans. The roadmap was more like a set of themes with no precise dates attached. I also would have liked to hear from more customers, including non-OEM customers who don’t have an interest in promoting their own business. MapR has a solid list of high-profile customers, but it’s understandably hard to get an executive from an American Express, Audi, Novartis or United Healthcare to come speak at a tiny insider event in mid December.

Given MapR’s comparatively small size (which it doesn’t disclose but is likely somewhere between $100 million and $200 million), I would have liked to have heard a more nuanced, flexible positioning in the “land-and-expand” or “we can work with incumbent tools or replace them” vein. Instead we heard the hard-sell “we can do it all and do it better than all those other [popular and widely used] tools out there.” I’m guessing that in real-world sales situations there are plenty of developers and influencers predisposed to popular open source choices. I’m also guessing MapR has an easier time making a case for its converged story once it’s established inside a company. And no doubt it gets the nod first as a big data analytics platform, and not as a stand-alone NoSQL database or streaming choice.

I completely agree with MapR that people have to stop thinking of analytics only as reports, data visualizations and other types of human interactions and start thinking more about embedding analytics into transactional applications as automated triggers and actions. At the very least it should be alerts for exception conditions. As companies move toward these sorts of sophisticated, next-gen applications, MapR will have a better and better shot at being part of the conversation.


Spark Gets Faster for Streaming Analytics


Spark Summit East highlights progress on machine learning, deep learning and continuous applications combining batch and streaming workloads.

Despite challenges including a new location and a nasty Nor’easter that put a crimp on travel, Spark Summit East managed to draw more than 1,500 attendees to its February 7-9 run at the John B. Hynes Convention Center in Boston. It was the latest testament to growing adoption of Apache Spark, and the event underscored promising developments in areas including machine learning, deep learning and streaming applications.

The Summit had outgrown last year’s east coast home at the New York Hilton, but the contrast between those cramped quarters and the cavernous Hynes made comparison difficult. As I wrote of last year’s event, the audience was technical, and if anything, this year’s agenda seemed more how-to than visionary. There were fewer keynotes from big enterprise adopters and more from vendors.


Matei Zaharia of Databricks recapped Spark progress over the last year, highlighting growing adoption and performance improvements in areas including streaming data analysis.

The Summit saw plenty of mainstream talks on SQL and machine learning best practices as well as more niche topics, such as “Spark for Scalable Metagenomics Analysis” and “Analysis of Andromeda Galaxy Data Using Spark.” Standout big-picture keynotes included the following:

Matei Zaharia, the creator of Spark and chief technology officer at Databricks, gave an overview of recent progress and coming developments in the open source project. The centerpiece of Zaharia’s talk concerned maturing support for continuous applications requiring simultaneous analysis of both historical and streaming, real-time information. One of the many use cases is fraud analysis, where you need to continuously compare the latest, streaming information with historical patterns in order to detect abnormal activity and reject possibly fraudulent transactions in real time.

Spark has long handled fast batch analytics, but streaming support was limited to micro-batch processing (meaning up to seconds of latency) until the Spark 2.0 release last year. Zaharia said even more progress came with December’s Spark 2.1 release, with advances in Structured Streaming, a new, high-level API that addresses both batch and stream querying. Viacom, an early beta customer, is using Structured Streaming to analyze viewership of cable channels including MTV and Comedy Central in real time, while iPass is using it to continuously monitor WiFi network performance and security.
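For a flavor of what the unified Structured Streaming API looks like, here is a minimal, hypothetical PySpark sketch that joins streaming transactions against historical per-account averages and flags outliers. The paths, schema and the 5x threshold are my assumptions, not anything demonstrated at the Summit:

```python
# Hypothetical continuous application: the same DataFrame API handles the batch
# (historical) side and the streaming side, and flags unusually large transactions.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("fraud-sketch").getOrCreate()

# Historical per-account profile (batch source; path and columns are placeholders).
profiles = spark.read.parquet("/data/account_profiles")  # account_id, avg_amount

schema = StructType([
    StructField("account_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("ts", TimestampType()),
])

# Streaming transactions arriving as JSON files (a file source keeps the sketch simple).
txns = spark.readStream.schema(schema).json("/data/incoming_txns")

# Stream-static join, then flag transactions far above the account's historical average.
flagged = (txns.join(profiles, "account_id")
               .where(F.col("amount") > 5 * F.col("avg_amount")))

query = flagged.writeStream.outputMode("append").format("console").start()
query.awaitTermination()
```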

Alexis Roos, a senior engineering manager at Salesforce, detailed the role of Spark in powering the machine learning, natural language processing and deep learning behind emerging Salesforce Einstein capabilities. Addressing the future of artificial intelligence on Spark, Ziya Ma, a VP of Big Data Technologies at Intel, offered a keynote on “Accelerating Machine Learning and Deep Learning at Scale with Apache Spark.” James Kobielus of IBM does a good job of recapping Deep Learning progress on Spark in this blog.

Ion Stoica, executive chairman of Databricks, picked up where Zaharia left off on streaming, detailing the efforts of UC Berkeley’s RISELab, the successor of AMPLab, to advance real-time analytics. Stoica shared benchmark performance data showing advances promised by Apache Drizzle, a new streaming execution engine for Spark, in comparison with Spark without Drizzle and streaming-oriented rival Apache Flink.

Stoica stressed the time- and cost-saving advantages of using a single API, the same execution engine and the same query optimizations to address both streaming and batch workloads. In a conversation after his keynote, Stoica told me Drizzle will likely debut in Databricks’ cloud-based Spark environment within a matter of weeks and he predicted that it will show up in Apache Spark software as soon as the third quarter of this year.


The Apache Drizzle execution engine being developed at RISELab promises better streaming query performance compared with today’s Spark or Apache Flink.

MyPOV of Spark Progress

Databricks is still measuring Spark success in terms of number of contributors and number of Spark Meetup participants (the latter count is 300,000-plus, according to Zaharia), but to my mind, it’s time to start measuring success by mainstream enterprise adoption. That’s why I was a bit disappointed that the Summit’s list of presenters in the Capital One, Comcast, Verizon and Walmart Labs mold was far shorter than the list of vendors and Internet giants like Facebook and Netflix presenting.

Databricks says it now has somewhere north of 500 organizations using its hosted Spark Service, but I suspect the bulk of mainstream Spark adoption is now being driven by the likes of Amazon (first and foremost) as well as IBM, Google, Microsoft and others now offering cloud-based Spark services. A key appeal of these sources of Spark is the availability of infrastructure and developer services as well as broader analytical capabilities beyond Spark. Meanwhile, as recently as last summer I heard Cloudera executives assert that the company’s software distribution was behind more Spark adoption than that of any other vendor.

In a thought-provoking keynote on “Virtualizing Analytics,” Arsalan Tavakoli, Databricks’ VP of customer engagement, dismissed Hadoop-based data lakes as a “second-generation” solution challenged by disparate and complex tools and access limited to big data developer types. But Tavakoli also acknowledged that Spark is only “part of the answer” to delivering a “new paradigm” that decouples compute and storage, provides uniform data management and security, unifies analytics and supports broad collaboration among many users.

Indeed, it was telling when Zaharia noted that 95% of Spark users employ SQL in addition to whatever else they’re doing with the project. That tells me that Spark SQL is important, but it also tells me that as appealing as Spark’s broad analytical capabilities and in-memory performance may be, it’s still just part of the total analytics picture. Developers, data scientists and data engineers that use Spark are also using non-Spark options ranging from the prosaic, like databases and database services and Hive, to the cutting edge, such as emerging GPU- and high-performance-computing-based options.

As influential, widely adopted, widely supported and widely available as Spark may now be, organizations have a wide range of cost, latency, ease-of-development, ease-of-use and technology maturity considerations that don’t always point to Spark. At least one presentation at Spark Summit cautioned attendees not to think of Spark Streaming, for example, as a panacea for next-generation continuous applications.

Spark is today where Hadoop was in 2010, as measured by age, but I would argue that it’s progressing more quickly and promises wider hands-on use by developers and data scientists than that earlier disruptive platform.


Cloudera Focuses Message, Takes Fifth On Pending Moves


Cloudera executives can’t talk about IPO or cloud-services rumors. Here’s what’s on the record from the Cloudera Analyst Conference.

There were a few elephants in the room at the March 21-22 Cloudera Analyst Conference in San Francisco. But between a blanket “no comment” about IPO rumors and non-disclosure demands around cloud plans — even whether such plans exist, or not — Cloudera execs managed to dance around two of those elephants.

The third elephant was, of course, Hadoop, which seems to be going through the proverbial trough of disillusionment. Some are stoking fear, uncertainty and doubt about the future of Hadoop. Signs of the herd shifting the focus off Hadoop include Cloudera and O’Reilly changing the name of Strata + Hadoop World to Strata Data. Even open-source zealot Hortonworks has rebranded its Hadoop Summit as DataWorks Summit, reflecting that company’s diversification into streaming data with its Apache NiFi-based Hortonworks DataFlow platform.


Mike Olson, Cloudera’s chief strategy officer, positions the company as a major vendor of enterprise data platforms based on open-source innovation.

At the Cloudera Analyst Conference, Chief Strategy Officer Mike Olson said that he couldn’t wait for the day when people would stop describing his company as “a Hadoop software distributor” mentioned in the same breath with Hortonworks and MapR. Instead, Olson positioned the company as a major vendor of enterprise data platforms based on open-source innovation.

MapReduce (which is fading away), HDFS and other Hadoop components are outnumbered by other next-generation, open-source data management technologies, Olson said, and he noted that some customers are using the Apache Spark distribution Cloudera supports on top of Amazon S3 without using any components of Hadoop.

Cloudera has recast its messaging accordingly. Where years ago the company’s platform diagrams detailed the many open source components inside (currently about 26), Cloudera now presents a simplified diagram of three use-case-focused deployment options (shown below), all of which are built on the same “unified” platform.

Cloudera Deployment Packages

Cloudera-developed Apache Impala is a centerpiece of the Analytic DB offering, and it competes with everything from Netezza and Greenplum to cloud-only high-scale analytic databases like Amazon Redshift and Snowflake. HBase is the centerpiece of the Operational DB offering, a high-scale alternative to DB2 and Oracle Database on the one hand and Cassandra, MapR and MemSQL on the other. The Data Science & Engineering option handles data transformation at scale as well as advanced, predictive analysis and machine learning.

Many companies start out with these lower-cost, focused deployment options, which were introduced last year. But 70% to 75% of customers opt for Cloudera’s all-inclusive Enterprise Data Hub license, according to CEO Tom Reilly. You can expect that when Cloudera introduces its own cloud services, it will offer focused deployment options that can be launched, quickly scaled and just as quickly turned off, taking advantage of cloud economies and elasticity.

Navigating around the non-disclosure requests, here are a few illuminating factoids and updates from the analyst conference:

Cloudera Data Science Workbench: Announced March 14, this offering for data scientists brings Cloudera into the analytic tools market, expanding its addressable market but also setting up competition with the likes of IBM, Databricks, Domino Data, Alpine Data Labs and Dataiku, and a bit of coopetition with partners like SAS. Based on last year’s Sense acquisition, Data Science Workbench will enable data scientists to use R, Python and Scala with open source frameworks and libraries while directly and securely accessing data on Hadoop clusters with Spark and Impala. IT provides access to the data within the confines of Hadoop security, including Kerberos.
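The pitch, in short, is “bring your own language to data that never leaves the secured cluster.” As a hedged illustration of that pattern (not Workbench-specific code), here’s how a Python session might query Impala with the impyla client; the endpoint and table are placeholders:

```python
# Hypothetical sketch: run an aggregate query on the cluster via Impala and pull
# only the small result set back into pandas for exploration.
from impala.dbapi import connect   # pip install impyla
from impala.util import as_pandas

conn = connect(host="impala.example.com", port=21050)  # placeholder endpoint
cur = conn.cursor()
cur.execute("""
    SELECT region, COUNT(*) AS orders, AVG(order_total) AS avg_total
    FROM sales.orders
    GROUP BY region
""")
df = as_pandas(cur)   # the detailed data stays on the cluster
print(df.head())
```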

Apache Kudu: Made generally available in January, this Cloudera-developed columnar, relational data store provides real-time update capabilities not supported by the Hadoop Distributed File System. Kudu went through extensive beta use with customers, and Cloudera says it’s seeing an equal split of deployment in conjunction with Spark, for streaming data applications, and with Impala, for SQL-centric analysis and real-time dashboard monitoring scenarios.

Business update: CEO Tom Reilly said the company now has more than 1,000 customers, with at least half being large, Global 8,000 companies (the company’s primary target). This includes seven of the top-ten banks and nine of the top-ten telecommunications companies. The company now has 1,600 employees, up from 1,200 last year.

MyTake On Cloudera Positioning and Moves

Yes, there’s much more to Cloudera’s platform than Hadoop, but given that the vast majority of customers store their data in what can only be described as Hadoop clusters, I expect the association to stick. Nonetheless, I don’t see any reason to demur about selling Hadoop. Cloudera isn’t saying a word about business results these days — likely because of the rumored IPO. But consider the erstwhile competitors. In February Hortonworks, which has been public for two years, reported a 39% increase in fourth-quarter revenue and a 51% increase in full-year revenue (setting aside the topic of profitability). MapR, which is private, last year claimed (at a December analyst event) an even higher growth rate than Hortonworks.

Assuming Cloudera is seeing similar results, it’s experiencing far healthier growth than any of the traditional data-management vendors. Whether you call it Hadoop and Spark or use a markety euphemism like next-generation data platform, the upside customers want is open source innovation, distributed scalability and lower cost than traditional commercial software.

As for the complexity of deploying and running such a platform on premises, there’s no getting around the fact that it’s challenging – despite all the things that Cloudera does to knit together all those open-source components. I see the latest additions to the distribution, Kudu and the Data Science Workbench, as very positive developments that add yet more utility and value to the platform. But they also contribute to total system complexity and sprawl. We don’t seem to be seeing any components being deprecated to simplify the total platform.

Deploying Cloudera’s software in the cloud at least gives you agility and infrastructure flexibility. That’s the big reason why cloud deployment is the fastest-growing part of Cloudera’s business. If and when Cloudera starts offering its own cloud services, it would be able to offer hybrid deployment options that cloud-only providers like Amazon (EMR) and Google (Dataproc) can’t offer. And almost every software vendor embracing the cloud path also talks up cross-cloud support and avoidance of lock-in as differentiators compared to cloud-only options.

I have no doubt that Cloudera can live up to its name and succeed in the cloud. But as we’ve also seen many times, the shift to the cloud can be disruptive to a company’s on-premises offerings. I suspect that’s why we’re currently seeing introductions like the Data Science Workbench. It’s a safe bet. If and when Cloudera truly goes cloud, and if and when it becomes a public company, things will change and change quickly.


SAS Takes Next Steps to Cloud Analytics


SAS Viya is now available as the cloud-friendly platform for SAS Visual apps and, soon, SAS 9. Next up should be more cloud-based services options.

SAS, like many well-established tech vendors, has to keep one eye on the future and one eye on the past. At the April 2-5 SAS Global Forum in Orlando, FL, the company did its best to reassure the 5,500-plus attendees that it can take them into the future without obsoleting past investments in SAS technologies and training.

To the tens of thousands of companies running SAS 9 (there are some 77,000 site licenses for the software), the message was “SAS 9 is here to stay.” And to the thousands of customers running SAS’s newer Visual products (there are more than 6,000 site licenses for SAS Visual Analytics and north of 1,800 for SAS Visual Statistics), the message was “everything in the portfolio can now run on SAS Viya.”


SAS Viya is the lynchpin of the company’s future. Introduced at SAS Global Forum 2016, Viya is the company’s virtualization- and container-ready, Hadoop-compatible, next-generation back-end architecture. It supports in-memory, distributed processing at scale as well as lifecycle management for data and models. Microservices and REST APIs support services-oriented embedding of analytic services. Viya also extends language support beyond SAS to Python, Lua and, coming soon, R, Java and Scala.
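Because Viya exposes its analytic services through REST APIs, embedding a model score in an application can look like an ordinary HTTP call. The sketch below is a generic illustration; the URL, token and payload fields are placeholders of mine, not documented SAS endpoints:

```python
# Hypothetical sketch: call a Viya-hosted scoring service over REST and act on
# the result inside an application. Endpoint, token and fields are placeholders.
import requests

SCORING_URL = "https://viya.example.com/scoring/creditRisk/score"  # placeholder
HEADERS = {"Authorization": "Bearer <access-token>", "Content-Type": "application/json"}

payload = {"inputs": {"income": 52000, "debt_ratio": 0.31, "delinquencies": 0}}
resp = requests.post(SCORING_URL, json=payload, headers=HEADERS, timeout=10)
resp.raise_for_status()

score = resp.json().get("score")
print("approve" if score is not None and score < 0.2 else "refer to underwriter")
```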

At SAS Global Forum the company announced that the entire Visual suite – Visual Analytics, Visual Statistics, Visual Data Mining & Machine Learning, Visual Investigator, Visual Forecasting, Optimization and Econometrics – can now run on Viya. Heretofore these products ran on the SAS LASR Server, which will continue to be available and supported. But if you want the combination of scalability, virtualization and multi-cloud, container-based portability and flexibility, you’ll want Viya.

As for SAS 9, connectivity to Viya will be introduced in the third quarter, opening up big-data and machine-learning capabilities. These jobs will be sent to Viya while more routine workloads and procedures will continue to run on SAS 9.

“If you’re processing all your jobs in less than one or two minutes, you’re fine and you don’t need to move [to Viya],” said SAS co-founder and CEO Jim Goodnight during a keynote discussion. If you need big-data or modern machine learning capabilities, “…think of Viya as an extension of SAS 9.”

The plan is to add the more routine analytical capabilities to Viya over time, but some analyses and workloads, including mainframe (zOS) and AIX workloads, will remain tied to SAS 9 (currently on release 9.4).

Other important announcements at SAS Global Forum included:

SAS Visual Investigator 10.2: This analytic application was released in 2016, but the 10.2 update offers improved search and discovery, scorecarding, alerting, entity analytics, workflow and administrative capabilities. SAS is also developing and delivering pre-built content for more than 14 investigative use cases, including child welfare, anti-money-laundering, power and energy monitoring, insider threat detection, and prescription drug abuse.

Current Expected Credit Loss (CECL): This SAS content helps banks deal with what’s formally known as ASU 2016-13, a new U.S. GAAP standard for credit loss accounting. CECL replaces today’s “incurred loss” approach and will become effective for SEC filers in 2020. The rules establish a lifetime loss estimate for every loan, require point-in-time loss estimates and increase public disclosure requirements.

SAS-Cisco IoT Partnership: The companies have been working together for 18 months to create the Cisco SAS Edge-to-Enterprise IoT Analytics Platform. The platform includes SAS Event Stream Processing, now certified to run on Cisco UCS servers, and was launched with proof-of-concept content for the energy and mining industries.

SAS Results-as-a-Service: A combination of strategy and consulting services wherein SAS professionals deliver analytical solutions within weeks or months. Once a solution is approved, SAS can deploy it in the cloud or on-premises, with supporting managed services or, if desired, training and handoffs to customer teams. The service is aimed at companies that don’t have the staff or infrastructure available to tackle new analytic challenges.


The SAS Visual portfolio can now run on the Viya architecture. I’m hoping to see software-as-a-service options that would appeal to new customers.

MyTake on SAS Global Forum Announcements

As I commented after last year’s Global Forum, Viya is SAS’s modern architecture as well as its answer to multiple open-source and cloud threats. The biggest threat by far is Apache Spark, which is gaining adoption quickly and is now widely available as a service on multiple public clouds. Spark software is also distributed and supported for on-premises deployment by multiple vendors, including IBM and the big-three Hadoop vendors, Cloudera, Hortonworks and MapR.

Many SAS customers are, indeed, very conservative and more concerned about continued SAS 9 support than big-data analysis or cloud-deployment options. But with an eye to the future, SAS Viya is crucial. It can’t get here soon enough, in my book, because Apache Spark, as a scalable, in-memory platform and as an elastic, pay-for-what-you-use cloud service, has been steadily gathering steam.

In a one-on-one conversation, SAS Executive VP and CTO Oliver Schabenberger asserted that open-source alternatives lack the data-governance and model-management capabilities that many SAS customers, particularly regulated companies, insist upon. That may be, but dark warnings about governance failed to keep many BI practitioners from embracing self-service data-discovery and data-visualization products — even before those products gained governance capabilities.

On the topic of open source machine learning and algorithms, Schabenberger said “SAS won’t take a back seat to anybody on analytics.” But there’s no denying that the cost advantages and ecosystem strengths of the open-source model are driving huge adoption. That’s precisely why SAS opened up Viya to Python. SAS has had a lot to say, in recent years, about making free SAS software available to colleges and universities. But almost without exception, customers in the sessions I attended at the Forum said their new hires tend to use open-source languages such as Python.

If SAS governance and lifecycle-management capabilities and its analytic depth and breadth are truly superior, they’ll stand up to open-source competition. It’s clearly a topic of debate within SAS. Some executives pointed out that it’s long been possible to invoke open-source algorithms from within SAS, though they’d like to see more explicit support for Spark and other emerging options. And now that Viya is here, several execs hinted that SAS will be much more aggressive about offering SAS analytics and software as ready-to-run cloud services. These would be welcome, future-minded steps that might attract a next generation of SAS customers, but they’re not on the official roadmap for now. I’m hoping to see more openness and more cloud services options at next year’s SAS Global Forum.


Teradata Transition to Cloud and Consulting Continues


Teradata simplifies pricing, executes on business consulting and hybrid cloud strategy. A look at next steps in the company’s ongoing transition.

“Business outcome led, technology enabled.” This was the theme at the May 8-10 Teradata Third-Party Influencers Summit in San Diego, and it reflected a two-to-one ratio of consulting-oriented presentations to technology updates.

Teradata has been expanding already robust consulting and implementation offerings in part because mass migrations to cloud computing and open-source big data platforms like Hadoop have reduced demand for Teradata’s on-premises racks and appliances for data warehousing. Even as data volumes have continued to grow exponentially, Teradata’s revenues have declined in recent years from a high of $2.7 billion in 2014 to $2.3 billion in 2016.


Teradata compared its old (left) and new (right) pricing schemes and cloud managed services options at its May Influencers Summit.

Last year’s Summit was held shortly after the company replaced its CEO, announced plans to sell off its Aprimo marketing business unit, and introduced a more aggressive path to cloud and consulting services. At this year’s Summit we learned that Teradata has not only executed on that strategy, it has gone further to transform itself by pursuing simplicity, flexibility and control in four areas:

Pricing: Responding to feedback that its licensing approach was too complex, with too many licensing models and too many a la carte options, Teradata has devised a consistent, subscription-based licensing approach that will apply on-premises or in private or public clouds. The model is based on two dimensions: T Cores and Tiers. T Cores measure compute cores and disk I/O, but there are discounts if you’re using less than maximum input/output capacity.

The four tiers reflect how capacity is being used, ranging from the free Developer tier to the progressively more feature-rich (and more costly) Base (simple production), Advanced (production with mixed workloads) and Enterprise (mission-critical, enterprise workloads) tiers. The pricing is designed to be simple, predictable and consistent, with no penalties for choosing or moving between on-premises, private cloud or public cloud deployment. What’s more, pricing is more aggressive, with the Base tier taking on cloud rivals like Amazon Redshift.
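To make the two-dimensional model concrete, here’s a toy calculation; every rate and number below is invented purely to show the arithmetic and bears no relation to actual Teradata pricing:

```python
# Purely hypothetical illustration of a T-Core x Tier subscription estimate.
# Rates and the I/O discount factor are made up for the sake of the example.
TIER_RATES = {"developer": 0.0, "base": 100.0, "advanced": 160.0, "enterprise": 220.0}

def monthly_estimate(t_cores, tier, io_utilization=1.0):
    """io_utilization < 1.0 models the discount for using less than maximum I/O capacity."""
    return t_cores * io_utilization * TIER_RATES[tier.lower()]

print(monthly_estimate(t_cores=480, tier="advanced", io_utilization=0.75))
```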

Portfolio: Where Teradata previously offered as many as nine systems in its portfolio, it now offers just two. IntelliFlex, the company’s new flagship, separates storage and compute decisions to support multiple workloads within a single rack. Customers can add different types of storage and compute nodes, ranging from archival retrieval to the ultimate in in-memory query performance. Customers can also add capacity in smaller increments than previously available and they can quickly reconfigure as needs change.

IntelliBase is Teradata’s entry-level appliance. It costs approximately 15% more than commodity hardware and is designed for more balanced data warehouse workloads. It is not as flexible as IntelliFlex, which can be reconfigured to address high I/O or high CPU requirements.

Cloud: Teradata has made over its managed cloud services and recast them as IntelliCloud. The offering combines the new T-Core- and Tier-based pricing scheme with three flexible infrastructure options behind the cloud services. Teradata previously offered only appliance-grade (2800 series) capacity behind its services, but you can now choose IntelliFlex or IntelliBase as the platform for managed services in the Teradata Cloud, which has data centers in Las Vegas and Frankfurt. The third option is Teradata-managed services running on carefully selected infrastructure services in the Amazon cloud (and, later this year, Microsoft’s Azure cloud). Consumption options are more elastic with the public cloud choices, but they won’t be as performant as IntelliFlex-based capacity, and service-level agreements aren’t available because Teradata has no control over the infrastructure. The intent is to give customers choice, with a fourth option being to bring your own license and manage Teradata Database on AWS or Azure yourself.

Consulting: Teradata has consolidated its growing consulting offerings under the Teradata Global Services umbrella, and it has formalized three service lines to avoid overlaps and confusion. Think Big Analytics, the big data consulting business Teradata acquired in 2014, continues as the business-outcome-focused unit, offering industry-focused expertise in data science, data visualization and big data solutions. Enterprise Data Consulting focuses on technology, offering expertise in architecture, data management, data governance, security and services. Customer Services helps customers get the most out of their systems and people, applying proactive and reactive expertise in systems and software management and change management.

MyPOV on Teradata’s Ongoing Transformation

Disruptive market forces have dealt Teradata a tough hand to play. There’s clearly disillusionment with complex open source platforms like Hadoop these days, but that doesn’t mean we’re going back to Teradata’s heyday of enterprise data warehousing. Companies are still pursuing high-scale data lake approaches on low-cost, distributed platforms, whatever flavor prevails (whether that’s HDFS, object stores like S3 and Azure Data Lake, or the next open source fad). Companies will also continue to rationalize their comparatively high-cost data warehousing infrastructure expenditures.

Teradata acknowledged last year that we’re in a “post-relational world,” but this year’s Summit shows signs that it’s truly adapting to a changed market. The company has not only delivered far more flexible hardware, it has gone further with the simplified, hybrid subscription-based pricing and more flexible cloud-deployment options.

Teradata is becoming more of a software and services company and less of a hardware vendor. That shift should eventually improve profitability, even if revenues continue to slide as deployments shift to the cloud.

Will customers trust Teradata to provide impartial, “business outcome led” consulting services? Gerhard Kress, director of Data Services at Siemens, said he chose Teradata in 2013 in large part because “the company understands that the world is a lot bigger than Teradata.” Kress presented at the Summit on the train manufacturer’s global IoT deployment, and he noted that other vendors (mostly big platform vendors) asserted that they could address all challenges within their stacks. Teradata, meanwhile, suggested a heterogeneous approach reflecting technologies already in place at Siemens.


This “Blended Architecture” slide, from a Teradata Ecosystem Architecture Services presentation, captures the vendor’s realistic sense of its place within enterprise environments.

Teradata has also become more realistic about its cloud ambitions. Two years ago Teradata talked about pursuing midsize businesses with the Teradata Database on AWS service. At this year’s Summit Teradata said it’s no longer pursuing that idea. Instead, executives said the company is focused on the needs of the 500 highest-scale and most sophisticated customers. That’s where Teradata’s technology really shines.

Teradata thrived in the past when it focused on delivering data-driven business outcomes at the top of the market. It appears that focus is back.


Qlik Plots Course to Big Data, Cloud and ‘AI’ Innovation


Qlik highlights upgrades and the roadmap to high-scale, hybrid cloud and ‘augmented intelligence.’ Here’s my take on the long-range plans.

Big data scalability, hybrid cloud flexibility and smart “augmented” intelligence. These are the three priorities that business intelligence and analytics vendor Qlik officially put on its roadmap at the May 15-18 Qonnections conference in Orlando, Florida.

Qlik also highlighted six important upgrades coming in the Qlik Sense June 2017 release – one of five annual updates now planned for the company’s flagship product (reflecting cloud-first pacing, though on-premises customers can choose whether and when to make the move). The June upgrade highlights include:

  • Self-service data-prep capabilities
  • New data visualizations and color-selection flexibility
  • Qlik GeoAnalytics geospatial analyses added through the vendor’s January acquisition of Idevio
  • An improved Qlik Sense Mobile app that supports offline analysis
  • Support for advanced analytics capabilities based on R and Python
  • Easier conversion of QlikView apps to Qlik Sense.

Qlik is promising “augmented intelligence” said to combine the best of machine intelligence with human interaction and decisions.

Most of these upgrades earned hearty applause from the more than 3,200 attendees at the Qonnections opening general session, but the sexiest and most visionary announcements were the ones on the roadmap. Here’s a rundown of what to expect, along with my take on what’s coming.

Building Toward Big Data Analysis

Qlik’s key differentiator is its associative QIX data-analysis engine, which is at the heart of the company’s platform and is shared by its Qlik Sense and QlikView applications. QIX keeps the entire data set and rich detail visible even as you focus in on selected dimensions of data. If you select customers who are buying X product, for example, you’ll also see which customers are not buying that product. It’s an advantage over drill-down analysis, where you filter out information as you explore.
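The associative idea is easy to demonstrate outside of Qlik. The toy sketch below is just the concept, not Qlik’s engine: a selection yields both the associated values and the values excluded by that selection, so nothing drops out of view.

```python
# Toy illustration of associative selection: choosing product "X" surfaces both the
# customers associated with it AND the customers excluded by the selection.
orders = [
    {"customer": "Acme", "product": "X"},
    {"customer": "Beta", "product": "Y"},
    {"customer": "Crag", "product": "X"},
    {"customer": "Dune", "product": "Z"},
]

all_customers = {o["customer"] for o in orders}
buying_x = {o["customer"] for o in orders if o["product"] == "X"}

print("Buying X:    ", sorted(buying_x))                    # associated values
print("Not buying X:", sorted(all_customers - buying_x))    # excluded, but still visible
```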

There have been limits, however, in how much data you can analyze within the 64-bit, in-memory QIX engine. Qlik has a workaround whereby you start with aggregated views of large data sets. Using an On-Demand App Generation capability you can then drill down to the detailed data in areas of interest. But the drawback of this approach is that you lose the powerful associative view of non-selected data.

The Associative Big Data Index approach announced at Qonnections will create index summaries of large data sets, drawn from sources such as Hadoop or high-scale distributed databases. A distributed version of the QIX engine will then enable users to explore the fine-grained detail within slices of the data without losing sight of the summary-level index of the entire data set.

MyPOV on Qlik big data capabilities: What I like about the Associative Big Data Index is that it will leave data in place, whether that’s in the cloud or in an on-premises big data source. It brings the query power to the data, eliminating time-consuming and costly data movement. The distributed architecture also promises performance. In a demo, Qlik demonstrated nearly instantaneous querying of a 4.5-terabyte data set. Granted, it was a controlled, prototype test, so we’ll have to wait and see about real-world performance.

Speaking of waiting, on big data, as on the hybrid cloud and augmented intelligence fronts, Qlik senior vice president and CTO Anthony Deighton set conservative expectations, telling customers they would see progress by next year’s Qonnections event. He didn’t rule out the possibility of an earlier release, but nor did he promise that any of the new capabilities would be generally available by next year’s event. As has been Qlik’s habit in recent years, it’s responding slowly to demands in emerging areas like big data and cloud.

Preparing for Hybrid Cloud

The business intelligence market has forced a binary, either-or, on-premises or cloud-based choice, said Deighton. He vowed that Qlik will change it to an and/or choice by fostering hybrid flexibility with the aid of microservices, APIs and containerized deployment. The approach will also require sophisticated, federated identity management, which the vendor has developed to support European GDPR data security and privacy compliance requirements set to go into effect next year.

In a prototype preview at Qonnections, Qlik demonstrated workloads being spawned and assigned automatically across Qlik nodes running on Amazon, in the Qlik Cloud and on-premises. The idea is to flexibly send workloads to the most appropriate resources. That could mean spawning public cloud instances on the fly when scale is required. Or it could mean keeping analyses on-premises when regulated data is involved. Qlik is working with big banks and hospitals, among other customers, to master microservices orchestration across on-premises, private-cloud and public-cloud instances. 
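The placement logic Qlik demonstrated can be summarized in a few lines. The sketch below is my own simplification of the idea, with arbitrary rules and thresholds, not Qlik code:

```python
# Hypothetical hybrid workload placement: keep regulated data on-premises, burst
# big jobs to the public cloud, and default everything else to the private cloud.
def place_workload(job):
    if job["contains_regulated_data"]:
        return "on-premises"
    if job["estimated_nodes"] > 8:          # arbitrary burst threshold
        return "public-cloud (elastic)"
    return "private-cloud"

print(place_workload({"contains_regulated_data": False, "estimated_nodes": 12}))
```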

MyPOV On Qlik’s cloud plans: As noted above, Qlik made no promises as to when it will deliver on this flexible, cloud-friendly microservices vision, other than to say that we’ll hear more at Qonnections 2018. Qlik’s cloud offerings need these workload-management features, particularly where Qlik Sense Enterprise in the cloud is concerned. Customers want better performance as well as the granular services and APIs they’re used to from leading SaaS vendors. I believe it’s more important for Qlik to deliver quickly on this front than on any other, so let’s hope it’s something introduced before Qonnections 2018.

Augmenting Intelligence

There have been many announcements about “smart” capabilities this year. A few of the capabilities have actually launched (like those covered in my detailed reports on Salesforce Einstein and Oracle Adaptive Intelligent Apps), but most are works in progress. Some are conservatively described as automated predictive analytics or machine learning while others are billed as “artificial intelligence.”

Over the past year, Deighton and other Qlik executives have charged that competitive AI and cognitive offerings tend to remove humans from decision making. In keeping with this theme, the company announced that it’s working on “augmented intelligence” that will “combine the best” of what machines can do with human input and interaction. The approach will eschew automation in favor of machine-human interaction that will bring context to data and promote better-informed machine learning, said Deighton.

The general idea is for humans to interact with concise lists of computer-generated suggestions. This will happen through computer-augmented interfaces at various stages in the data-analysis lifecycle. When users bring data together, for example, data-analysis algorithms will be applied to suggest how the data might be correlated. In the analysis stage, algorithms will suggest the best analytical approaches. And once results are generated, data-visualization algorithms will be applied to suggest best-fit visualizations. Humans will interact with the suggestions and make the final selections at every stage. Deighton promised something that will neither dump too many possibilities on users, at one extreme, nor create “trust gaps” by automating decisions and removing human input at the other.

MyPOV on Qlik Augmented Intelligence: Based on conversations with Qlik executives, I’d say we’re in the early stages of Qlik’s augmented intelligence initiative. It all sounds good, but the details were sketchy. I heard a bit about analytic libraries and potential partnerships on the machine learning and neural net front. But executives weren’t ready to name partners or predict availability. In short, we may see the beginnings of Qlik’s augmented intelligence capabilities at Qonnections 2018, but Qlik execs were up front in describing the initiative as something that may take a few years to mature.

Qlik’s most direct competitors, including Tableau, Microsoft, SAP and IBM, are all working on smart data exploration, basic prediction and “smart” recommendation features of one stripe or another. IBM is actually on the second-generation of its cloud-based IBM Watson Analytics service. Yet we’re still in the very earliest phases of bringing advanced analytics, machine learning and artificial intelligence to the broad business intelligence market. I think 2017 may mark the end of the beginning. By 2018 and beyond, we’ll start to see vendor selections based on smart features rather than the maturing trend toward self-service capabilities.

RELATED READING:
Qlik Gets Leaner, Meaner, Cloudier
Inside Salesforce Einstein Artificial Intelligence
Tableau Sets Stage For Bigger Analytics Deployments


Infor Advances Data Agenda With ‘Coleman’ AI, Birst BI Integration


Infor lays out plans for artificial intelligence and cloud-based business intelligence and analytics. Here’s what customers need to know.

Infor entered the increasingly crowded artificial intelligence (AI) arena July 11 by introducing its Coleman AI platform. Unveiled at the company’s Inforum 2017 event in New York, Coleman was described as a language- and image-savvy AI platform that will automate rote tasks and augment human capabilities in a range of industry-specific use cases. Infor also used Inforum to detail its plans for Birst, the cloud-based business intelligence platform it acquired in April.


Named for 1960s-era NASA mathematician Katherine Coleman Johnson, one of three African American NASA mathematicians portrayed in the 2016 film “Hidden Figures,” Coleman builds on Infor assets including machine learning capabilities for retail acquired with Predictix and the work of the company’s Dynamic Science Labs. The labs have developed apps including a price optimization app for distributors and an inventory optimization app for healthcare providers.

Existing apps and machine learning capabilities are being folded into the Coleman portfolio, and Infor says it’s also well along in developing a unifying data platform for AI. Built on Amazon Web Services infrastructure, including an Amazon S3 data lake, the platform will aggregate petabytes of data available from Infor’s various industry-focused CloudSuites. The data will fuel AI-powered optimization, recommendation, and decision-automation services that will be delivered through Infor’s Ion API-integration platform.

Infor will use the speech recognition and natural language understanding capabilities of the Amazon Lex AI service to power conversational capabilities. This will enable Coleman to serve as a digital assistant, answering questions such as, “Coleman, what are the payments outstanding for company X,” or “Coleman, forecast demand for product Y.” Conversational UIs will also support multi-faceted processes with simple requests such as, “Coleman, approve the promotion of Sara Jones.”
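To ground the conversational piece, here’s what a generic Amazon Lex call looks like from Python. The bot name, alias and utterance are placeholders; this illustrates Lex usage in general, not Infor’s actual Coleman integration:

```python
# Hypothetical sketch: send a natural-language request to an Amazon Lex bot and
# read back the recognized intent and response. Bot name/alias are placeholders.
import boto3

lex = boto3.client("lex-runtime", region_name="us-east-1")

response = lex.post_text(
    botName="ColemanDemoBot",   # placeholder, not an Infor bot
    botAlias="prod",            # placeholder
    userId="analyst-42",
    inputText="What are the payments outstanding for company X?",
)

print(response.get("intentName"), "->", response.get("message"))
```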

With image-recognition capabilities, Coleman will turn smart phone cameras into search tools. Take a picture of a product, for example, and Coleman will be able to identify the product and detail prices, specifications and availability details. Used internally, Coleman will tell workers how much inventory is available, what’s on order, when it will arrive and whether alternatives are available from other suppliers.

MyPOV on Coleman Vision vs. Reality

Infor certainly offered a compelling AI vision, but executives also acknowledged that it’s “early days” for Coleman. Data-pipeline development and modeling are further along in some industries than in others, and smart services for retail are likely to arrive first. As for the ingredients of Coleman, it’s a mixed collection of Infor and third-party IP. It remains to be seen just how it will all come together and when we’ll start seeing smart prediction, recommendation, optimization and automation services beyond what existed before Coleman was announced.

I’m particularly eager to understand how and whether Infor uses generalized models based on aggregated CloudSuite data and then develops customer-specific recommendations and smart services. With its recent Leonardo launch, SAP talked about combining shared models and customer-specific models. Customers will also want to know whether it’s as simple as “turning Coleman on” from within a CloudSuite, as Infor executives suggested. We have also yet to learn just how much data will be required to generate reliable predictions, recommendations, optimization and automated actions.

Infor’s Birst Acquisition and Plans for Analytics

When Infor announced the acquisition of Birst in April, the first question for customers of the cloud-based BI company was, “will Birst disappear as an independent company?” The answer was an emphatic no. Birst operations continue unchanged under the direction of the existing management team. Birst founder and CEO, Brad Peters, continues as General Manager of what’s now known as “Birst, an Infor Company.”

The first question for Infor customers was, “how will Birst work with my existing Infor BI investments, including Cognos-based reporting tools and Infor BI cubes?” The company laid out detailed integration plans at Inforum.

In the first phase of the three-phase plan, currently underway, Infor customers can swap out like-for-like capabilities that they’ve licensed from Infor for newer, cloud-based capabilities from Birst. For example, customers using Infor’s Cognos-based reporting tools can switch to reporting tools available from Birst. They’ll also be able to add capabilities, at extra cost, that weren’t available from Infor, such as Birst’s ETL engine and self-service data-exploration and visualization capabilities.

Infor has already integrated Birst with Infor Ming.le collaboration and single-sign-on capabilities, but phase two, expected to be completed in September, will bring even deeper levels of integration. For example, customers using Infor BI will be able to use Birst as a cloud-based environment for analysis and reporting. This will enable Infor customers to leverage their Infor BI cubes while also blending in external data sources and taking advantage of Birst’s modern data-exploration and visualization capabilities. Integration is also underway between Infor’s Amazon S3- and Athena-based data lake environment and Birst. This will extend Birst’s big data analysis capabilities.

In phase three, Birst will add predictive capabilities from Infor’s Dynamic Science Labs unit. We’ll also see deeper integration of Birst-powered analytic services into the Infor XI platform as well as ties to Coleman AI services. The delivery date for phase three was unspecified.

MyPOV on Birst Integration

Infor is doing its best to preserve existing customer investments while providing an upgrade path to newer and more extensive Birst capabilities. On the reporting front there’s no migration path for existing operational reports, so customers will likely phase out the tools supporting legacy reports gradually while building new reports on Birst. The deeper and more significant value is in Infor BI, so it’s a big win for customers to see those cubes integrated with Birst’s cloud-accessible, self-service analytical environment. Overall, Birst brings significant upgrades to Infor’s analytics layer that will help support the company’s move into big data and artificial intelligence.

Related Reading:
SAP Machine Learning Plans: A Deeper Dive From Sapphire Now
Oracle Launches Adaptive Intelligent Apps for CX
Qlik Plots Course to Big Data, Cloud and ‘AI’ Innovation

 

 



Microsoft Stresses Choice, From SQL Server 2017 to Azure Machine Learning


Microsoft Ignite announcements focus on giving customers options, including on-premises, cloud, operating systems, and ML and AI frameworks.

Microsoft is getting really serious about giving customers choices. That much was clear at this week’s combined Microsoft Ignite and Envision events in Orlando and, in particular, in announcements around databases, data-integration, machine learning (ML) and artificial intelligence (AI).

Several announcements at Ignite were entirely about choice. On the hybrid front, for example, there was the general availability of Azure Stack, which lets customers put a slice of the Azure Cloud on premises — on a choice of hardware-partner racks. But that’s about infrastructure. My focus was on what Microsoft described as creating “systems of intelligence.” I’ll focus here on database, database migration, data integration, ML and AI.


SQL Server 2017 Meets Linux, Docker

Microsoft announced that SQL Server 2017, the latest release of its flagship database, will be generally available on October 2. The big breakthrough is that this release runs on Linux as well as Windows (and the company is offering new-customer incentives, including subscription discounts and bundles with Red Hat). Another new deployment option is within Docker Enterprise Edition containers for portability across clouds and on-premises environments. Beyond portability, SQL Server 2017 introduces advances in adaptive query processing, the ability to add clustered columnstore indexes for faster analytical performance and support for running R and Python models entirely in-database.
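
As a quick illustration of the in-database R/Python option, here’s a minimal sketch that calls SQL Server’s sp_execute_external_script from Python over ODBC. It assumes a SQL Server 2017 instance with Machine Learning Services installed and ‘external scripts enabled’ turned on; the server, credentials and Orders table are hypothetical.

import pyodbc

# Connection details are placeholders; assumes SQL Server 2017 with
# Machine Learning Services installed and 'external scripts enabled' set to 1.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sql2017.example.com;DATABASE=SalesDB;UID=demo;PWD=demo_pwd"
)

# sp_execute_external_script runs the Python snippet inside the database
# engine, so the data never leaves the server.
tsql = """
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'
import pandas as pd
OutputDataSet = InputDataSet.describe().reset_index()
',
    @input_data_1 = N'SELECT order_total FROM dbo.Orders';
"""

cursor = conn.cursor()
for row in cursor.execute(tsql):
    print(list(row))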

Analysis: Linux is the favored operating system of the cloud, and the Windows-only constraints on Microsoft SQL Server were getting in the way of growth. Together with the Docker option, this new multi-platform and hybrid flexibility should accelerate adoption.

Azure DB Migration Service Courts Oracle, MySQL

Now in private preview, Azure Database Migration Service is designed to help you migrate on-premises Microsoft SQL Server, Oracle and MySQL instances to Azure. Also in limited preview is a coming Azure SQL Database Managed Instance, a platform-as-a-service option with VNet and private IP support.

Analysis: There’s still only “close to full compatibility” for migration of on-premises Microsoft SQL Server to Azure SQL Database. The differences may be small, but Oracle touts its “same-DB-no-matter-where-you-deploy” advantage.

Azure Data Factory

This data-integration service for Azure, now in public preview, supports the creation, scheduling and orchestration of data-integration pipelines with the option to lift and shift SQL Server Integration Services (SSIS) packages into the cloud. Microsoft says this soon-to-be-GA service will include discounted rates for active SQL Server licensees.

Analysis: I’d like to hear more about the practical differences, if any, between Azure Data Factory and SSIS in capabilities, management, administration and the overall user experience.


Next-Gen Azure Machine Learning

ML and AI underpin “smart” systems that predict, spot patterns and exceptions, develop inferences about intent, and offer recommendations. Microsoft put ML/AI modeling capabilities in the cloud several years ago with Azure ML/Azure ML Studio. But that first-generation offering was strictly a cloud service run by Microsoft on Azure. The next generation of Azure ML gives organizations options through three components announced at Ignite and now in public preview.

  • Azure ML Workbench is a cross-platform client for data wrangling and managing experiments. It runs on Windows and macOS machines and is geared to developers and data scientists who need to take the first step in creating models: preparing the data. Users can tap into a broad range of data sources, including high-scale sources, and see samples, stats and distribution information about that data. The tool can learn the clean-up and normalization steps you want to take by example and then repeat them at scale. These steps are recorded for data transparency and lineage. From there you can use the data for your modeling experiments.
  • Azure ML Experimentation service is built to support collaborative model development at scale. It uses Git repositories and a command-line tool to manage model experimentation and training. It tracks the code, configurations and data used in experiments as well as the models, log outputs, key metrics and the history of how those models evolve. This ensures transparency around models over time, which is often a requirement in regulated environments. Providing choice, the Experimentation service supports Python and an array of frameworks, including TensorFlow, Caffe, PyTorch, MXNet and DIGITS as well as Microsoft’s own Cognitive Toolkit (CNTK). There are also plenty of deployment choices. Docker containers are used for portability to many environments while maintaining model and data governance, auditability and visibility. Experiments can run locally or remotely, on general-purpose VMs, scale up on Data Science VMs, scale out on Spark (in Azure HDInsight), and can even run on GPU-accelerated VMs. (See the training-script sketch after this list.)
  • Azure ML Model Manager service is for deployment and operationalization, supporting hosting, versioning, management and monitoring. Here, too, there are many more choices, including in-database in SQL Server 2017, in VMs, on Spark, in the Azure cloud and anywhere you can run Docker containers.
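
As a point of reference for the Experimentation service described above, here is a deliberately generic, hypothetical sketch of the kind of training script such a run-management service would track. It uses plain scikit-learn and writes metrics and the model artifact to files; it intentionally avoids any service-specific SDK calls, which vary by product and release.

# Generic sketch of a training script of the sort an experimentation/run-
# management service could track; no service-specific SDK calls are used.
import json
import pickle

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

# Key metrics and the trained artifact are written out so the experiment
# tracker (whatever it is) can version them alongside the code and config.
with open("metrics.json", "w") as f:
    json.dump({"accuracy": accuracy}, f)
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

print(f"accuracy={accuracy:.3f}")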

Analysis: Together, all these options give data scientists and developers yet more flexibility around where they do their experimentation, training of models and operational scoring. Significantly, there’s more choice on frameworks, with Microsoft executives saying that algorithms shouldn’t matter – use whatever is best for the task at hand. Docker is the primary means of model portability, but Microsoft says deployment can be as simple as a single line of code while also giving Docker power users options to tune and tweak the deployment. You can also bring assets directly onto local machines, but you lose traceability. The whole idea here is supporting and bringing visibility to the entire, end-to-end lifecycle at scale. That’s a must-have for banks, insurance companies and a growing list of organizations that are doing predictive, machine learning and AI modeling at scale.

My Take on Ignite 2017

There were so many more announcements at Ignite that will make a big impact in the near term (like global-scale CosmosDB) and over the long term (like Microsoft’s work on quantum computing). The overall theme was choice, with Microsoft offering an impressive, broad spectrum of cloud, on-premises and hybrid options for data scientists, developers, data-management and governance professionals, and on up to business users and the customers of Microsoft’s customers. Many of this week’s announcements are still in preview — and there are gaps, here and there, yet to be filled. But I came away impressed.

Related Reading:
Oracle Differentiates its MySQL Cloud Service
SAP Machine Learning Plans: A Deeper Dive From Sapphire Now
SAS Takes Next Steps to Cloud Analytics


Oracle Open World 2017: 9 Announcements to Follow From Autonomous to AI


Oracle highlights machine learning and artificial intelligence for running cloud services, delivering smart applications and driving data-driven decisions. Here’s what’s coming.

Oracle’s new Autonomous Database Cloud will be cheaper, faster and, with the addition of the Oracle Cyber Security System, safer than anything from Amazon Web Services (AWS). At least that’s the assertion Oracle Executive Chairman and CTO Larry Ellison wants everyone to remember from last week’s Oracle Open World 2017 (OOW17) event in San Francisco.

Whether Oracle’s claims are fair and accurate comparisons remains to be seen, as the first release of the Oracle Autonomous Database, through a Data Warehouse Cloud Service, won’t be available until December. Count on it being at least a few more months before independent reviewers can run comparative tests against rival cloud services.


It should be about that same time – six months from now – that several other data-related announcements from OOW17 will actually be available. Some notable OOW17 announcements are generally available today, such as the Oracle Big Data Cloud, Oracle Event Hub and Stream Analytics Cloud services, and the Oracle Analytics Cloud Data Lake Edition. But whether it’s the next round of Adaptive Intelligent Apps, coming artificial intelligence (AI) and machine learning (ML) Platform as a Service capabilities, or a series of Oracle Analytics Cloud upgrades, many of the more interesting announcements from OOW17 will emerge over the next three to six months.

Here’s a rundown on 9 announcements to follow over the coming months.

Oracle Autonomous Database Cloud

Winning the cloud war is, of course, crucial for Oracle. Oracle has 480,000-plus customers, and its number-one product, hands down, is Oracle Database. If there’s a cloud equivalent to the domino theory, it’s that as cloud database selections go, so go the rest of a customer’s cloud choices. Thus, Ellison’s better-faster-cheaper performance and cost claims were on display all over OOW17. He also told attendees that with discounts, the database service will start at as little as $300 per month — though the 1-CPU to 1-terabyte-of-data specification seemed anemic, to say the least.

In CEO Mark Hurd’s keynote we heard that only about 14% of production workloads are now running in public clouds, but given growing cloud momentum he predicted that 80% of production workloads will be in the cloud by 2025. The lion’s share of today’s database marketplace is Oracle’s to lose, so there’s huge pressure to prevent customers from even thinking about alternatives like Amazon Redshift or Amazon Aurora. The (according-to-Oracle) cost and performance claims flashed all over OOW17 were a “stick with Oracle” message to on-premises customers now considering the cloud.

Analysis: It’s important to recognize that the Oracle Autonomous Database Cloud is just that, a cloud-based service. Oracle Database 18c software, when it arrives, won’t have inherent Autonomous capabilities. To be Autonomous it has to be delivered as a service by Oracle. The same goes for the on-premises deployment option, which won’t be Autonomous unless it’s delivered “Cloud at Customer” style, with an Oracle Cloud Machine deployed in customer data centers but managed by Oracle.

Oracle has been talking for at least a couple of years about the efficiency it can offer through automation in the cloud, but the Autonomous Database Cloud is said to bring these advantages to a whole new level. Time will tell just how much extra oomph the ML-driven Autonomous tuning and optimization delivers compared to like-for-like Oracle 12c database services.

As for those test results, the comparison of Oracle Database on Oracle Cloud vs. Oracle Database on Amazon is a shoo-in for the home team given Oracle’s ability to run the database on Exadata, which is not an option on Amazon and which pushes query processing down to the storage layer, reducing the load before the query even gets to the database engine. Oracle Database on Exadata vs. Redshift on Amazon is more of an apples-to-apples comparison. Here’s where I’m eager to see independent test results.

Assuming that ML-driven database automation offers advantages (and I’m sure there will be many), the good news is that Oracle has a slew of other Autonomous database services in the pipeline, including Autonomous OLTP, expected next June, and Autonomous NoSQL and Graph database services likely to show up by OOW18.

Oracle Adaptive Intelligent Apps

Announced rather quietly at OOW16, Oracle Adaptive Intelligent Apps are a family of cloud-based, machine-learning powered apps that are integrated with Oracle cloud applications. The company spent the first half of 2016 putting the required machine learning data pipelines in place. Using a combination of customer-specific SaaS data and third-party enrichment data from the Oracle Data Cloud, Adaptive Intelligent Apps will deliver customer-tailored recommendations that will improve decisions, outcomes and business results.

Generally available today are Next Best Offers and Recommendations, a subset of Adaptive Intelligent experiences coming to the Customer Experience (CX) Cloud. Following the roadmap laid out last year, Oracle announced Adaptive Intelligent Apps for HR, ERP and Supply Chain Management at OOW17. The company said it expects this next wave of smart apps to be released within the next 12 months.

Analysis: Oracle Adaptive Intelligent Apps for CX are just getting out of the gate. Two customers on hand at OOW17, Team Sportia of Sweden and Moleskine of Italy, both said their deployments were just getting started. This is later than I anticipated in my 2016 report on Adaptive Intelligent Apps, but Oracle always conservatively said these apps would debut “within the next 12 months.” Oracle’s chief rival on this front is, of course, Salesforce Einstein, which saw lots of splashy announcements in 2017. I’m eager to hear testimonials and deployment details from Salesforce Einstein customers at the upcoming Dreamforce event in early November.

I’ll be surprised to see new Oracle Adaptive Intelligent Apps outside of the CX arena through the first quarter of 2018. I’ve learned to be cautious, so I’m guessing those HR, ERP and SCM smart apps are at least six months away and likely to debut in limited release.

Oracle Big Data and AI Advances

Oracle is in some cases keeping up and in some cases catching up with market leaders on the big data and artificial intelligence fronts. This summer Oracle announced the Oracle Big Data Cloud, which is a big data platform based on Hadoop and Spark and closely aligned with the ODPi standard also used by Hortonworks, Microsoft and IBM. Oracle’s previous offering, the Big Data Appliance based on Cloudera, is still available both on-premises and as a hosted service. But the future focus is clearly on Oracle Big Data Cloud, which separates storage and compute decisions and offers object storage as a low-cost alternative for high-scale data lakes.

To address streaming and real-time applications, Oracle has added the Oracle Event Hub, which is based on open source Apache Kafka, for routing and processing. Oracle Stream Analytics is a rewrite of the company’s complex event processing technology that now runs on Apache Spark.

On artificial intelligence, Oracle President Thomas Kurian introduced a new AI & ML PaaS that will offer GPU compute capacity (both bare metal and VM) and a variety of open source AI frameworks, including Caffe, Keras and TensorFlow. Developers will be able to work with a variety of languages and notebooks.


Analysis: The Oracle Big Data Cloud is generally available immediately and puts Oracle more in step with the big data services available from AWS and Microsoft Azure. The Event Hub and Stream Analytics services are also both generally available today and fill a gap that Oracle had versus the AWS Kinesis portfolio and the Event Hubs and Stream Analytics on Microsoft Azure.

As for the AI & ML PaaS, it was announced at OOW17, but it’s not yet available (as per Oracle’s site). Based on the roadmaps I’ve seen, I’d expect the AI & ML PaaS to be available within three to six months. AWS, Google Cloud Platform and Microsoft have all had GPU capacity available for some time. On model development and deployment, Microsoft last month introduced the beta preview of its next-generation Azure Machine Learning portfolio that promises end-to-end model lifecycle management. In short, my take is that Oracle is still catching up on AI cloud services and capabilities.

Oracle Analytics Cloud

Here’s another area where Oracle is moving quickly to stay in step with the market. Evolving beyond the Oracle BI Cloud Service and Oracle Visualization Cloud Service announced three years ago, the Oracle Analytics Cloud combines those two services and adds capabilities to create a more comprehensive collection spanning data discovery, preparation, analysis and prediction. Standard and Enterprise Edition subscriptions were previously available. Oracle introduced a Data Lake Edition at OOW17 with subscriptions based on CPUs rather than users, thereby encouraging broad adoption.

Oracle announced a series of machine-learning and natural-language-processing-based enhancements to the Oracle Analytics Cloud at OOW17, and they’ll be available over the next three to six months. Automated Data Diagnostics is an “Explain” capability that will surface hidden drivers and guide users to data and analyses that they might not know to look at. Natural Language Insights will generate plain-text analyses of the salient points on a chart, helping users focus on what matters. Improved “Ask” Natural Language Query capabilities will support synonyms and abbreviations and dynamically correct and reinterpret queries as you type. Oracle is also working on Enhanced Data Catalog capabilities, including search and navigation across metadata and social tags as well as automated recommendations on related and relevant datasets to promote discovery.

Analysis: Oracle Analytics Cloud is on a path that’s similar to Microsoft PowerBI and Azure ML. Both vendors have created comprehensive portfolios and are seeking to leverage the strengths of their respective clouds and data platforms. The updates and enhancements announced at OOW17 mostly match state-of-the-art capabilities that are already available in the market. For example, Oracle’s “Explain” feature is akin to Salesforce BeyondCore and a similar feature embedded in Microsoft PowerBI. Natural Language Insights is akin to the Narrative Science and Automated Insights capabilities that Qlik and Tableau have both leveraged. State-of-the-art natural language query is available from multiple vendors, including Microsoft, IBM (Watson Analytics) and others.

My Overall Take on OOW17

As always, I came away from OOW17 impressed by the sheer breadth of applications and technologies available from the company. Oracle may not always be the first to introduce state-of-the-art capabilities, but it’s always the top competitor cited by rival database and data-platform vendors. The company’s sheer market presence in data platforms and applications puts it in a great position to also lead in analytics and coming smart applications. The key question, as with so many vendors these days, is how successful the company will be in transitioning existing customers to the cloud.

The surprise in the applications arena has been just how many all-new, greenfield customers Oracle has won with its SaaS applications. CEO Mark Hurd insists that Oracle’s core database business is outpacing the rest of the market, but that’s a licensing and subscription measure whereas I’m seeing a lot of open source growth that can’t be measured by the same math (even if the software is commercially supported).

Refreshingly, Oracle’s big data, data integration and analytics executives all seem to be hip to the open source movement. OOW17 saw a broad embrace of open source software, from Hadoop, Spark, Kafka and Cassandra (the latter by way of a partnership with Datastax) to Python, R and various open source deep learning frameworks. There will always be that Oracle bravado about its most successful commercial offerings, but I see a company that’s increasingly moving in step with a changing world of data platforms and technologies.

MariaDB Joins Latest Honorees on my Constellation ShortLists


Here’s why I added MariaDB — along with DataRobot, Datawatch, Domo, Kylo and Unifi — to my latest Constellation ShortLists.

Scalability, analytical capabilities and encryption: these were a few of the reasons for embracing MariaDB that I heard from customers at the vendor’s February 26-27 M18 user conference in New York.

MariaDB is an up-and-coming database management system (DBMS) created in 2009 by founders of MySQL. The fork came about in the wake of Oracle’s acquisition of MySQL as part of its purchase of Sun Microsystems. A big part of MariaDB’s appeal remains its MySQL compatibility, although differences have emerged in certain areas as the paths of these two database products have diverged (but more on that later).

MariaDB CEO Michael Howard announces plans for a database-as-a-service offering and MariaDB Labs research on distributed computing and machine learning.

Scalability came up in a keynote by customer Tim Yim of ServiceNow. Yim detailed ServiceNow’s massive deployment in which its multitenant, cloud-based platform runs nearly 85,000 instances of MariaDB TX. The deployment has spawned more than 176 million InnoDB tables and sustains roughly 25 billion queries per hour, despite constantly changing query patterns.

Analytics was the topic of my conversation with Aziz Vahora, head of data management at Pinger, the 12-year-old company behind the TextFree app for Wi-Fi texting and calling and the Sideline app for adding a second phone number to smartphones. Pinger has a years-old data warehouse built on MySQL and InnoDB that it never expected to grow to its current size of seven terabytes. Maintaining analytical performance has been increasingly difficult, requiring extensive sharding and laborious data management.

Pinger considered options including Snowflake and Amazon Redshift, but when Vahora learned that MariaDB was working on an analytical version of the database, introduced last November as MariaDB AX, he signed Pinger on as a beta customer. MariaDB AX is a columnar database, which makes a huge difference in analytical performance, but the appeal to Pinger was MySQL and InnoDB compatibility, meaning it wouldn’t have to change any reporting or SQL code.
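
For a sense of what that compatibility looks like in practice, here’s a minimal, hypothetical sketch using the MariaDB ColumnStore engine that underpins AX. The host, credentials and events table are invented, and the standard MySQL connector for Python is assumed to be pointed at a MariaDB AX instance.

import mysql.connector  # the MySQL connector also works against MariaDB

# Host, credentials and table are hypothetical; assumes a MariaDB AX /
# ColumnStore instance is reachable.
conn = mysql.connector.connect(
    host="mariadb-ax.example.com", user="analyst", password="secret", database="dw"
)
cur = conn.cursor()

# Columnar table: same SQL surface as InnoDB, only the storage engine changes,
# which is why existing reporting queries can usually run unmodified.
cur.execute("""
    CREATE TABLE IF NOT EXISTS events_cs (
        event_date DATE,
        user_id BIGINT,
        event_type VARCHAR(32),
        duration_ms INT
    ) ENGINE=ColumnStore
""")

# Typical analytical scan that benefits from columnar storage and compression.
cur.execute("""
    SELECT event_type, COUNT(*) AS events, AVG(duration_ms) AS avg_ms
    FROM events_cs
    WHERE event_date >= '2018-01-01'
    GROUP BY event_type
""")
for row in cur.fetchall():
    print(row)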

MariaDB AX, introduced in 2017, is a columnar version of the database that offers analytical advantages including high compression, faster querying and simplified administration.

Pinger wanted to add yet more data to the data warehouse, so it took three months to build an entirely new data pipeline to feed MariaDB AX. Once that was done, it wasn’t difficult getting data into the MariaDB AX columnstore, according to Vahora. The benefits have been many, he says, including 6X to 7X compression (so seven terabytes translated to just over one terabyte in the columnstore). With these economies Pinger plans to retain two years’ worth of data rather than six months. Most importantly, query times are 30 times to 100 times faster, depending on the data and query complexity, according to Vahora.

“In one example querying against six months’ worth of data across many users, an analysis that used to take two days took less than one hour,” said Vahora.

Encryption was the feature that attracted William Wood to MariaDB. In 2015, Wood, Director of Database Architecture at Financial Network Inc. (FNI) in St. Louis, was looking for an alternative to expensive commercial databases as the backbone of a standardized-yet-configurable application that could serve many bank customers in place of custom software. FNI is subject to the Payment Card Industry Data Security Standard (PCI DSS), which calls for encryption of data in transit and at rest. It so happened that MariaDB 10.1, released in 2015, introduced encryption at rest (ahead of several competitors that have since followed suit). FNI has since built out its standardized application, called Blueprint, as a price-competitive alternative to custom software, and Wood says the company is landing new customers as a result.

MyPOV on MariaDB

MariaDB remains compatible with MySQL 5.5 and, in most respects, with later releases. The latest releases have diverged in the handling of capabilities including clustering, JSON support and geospatial functions. From MariaDB’s perspective, there’s still a huge population of organizations deployed on MySQL 5.5, and compatibility isn’t much of a problem for those on more recent MySQL releases that aren’t using clustering, JSON support or geospatial functions.

MariaDB is not counting on MySQL compatibility alone; if all else is equal, why move? That’s why MariaDB is also innovating, with novel approaches to scalability and performance and with the analytical capabilities of AX. The vendor is also going after commercial competitors: MariaDB 10.3, now in beta, introduced Oracle Database compatibility through support for a subset of Oracle PL/SQL in MariaDB Stored Functions.

Pervasiveness is also crucial. MariaDB was already available as a cloud service as one of the six flavors of Amazon RDS. But the reason I added it to my just-published “Constellation ShortList for Hybrid- and Cloud-Friendly Relational Database Management Systems” is that Microsoft last year joined the MariaDB Foundation and launched its own MariaDB service on Azure. This made MariaDB a fit with my ShortList requirement of being available as software, for on-premises deployment, and as a service on multiple leading public clouds.

At M18 MariaDB announced its intention to add its own database service, which is expected to debut later this year. Details weren’t available from the vendor, but I expect to see an initial launch on AWS with Azure and, most likely, Google Cloud Platform to follow. That’s the pattern I’ve seen from other independent database vendors that want to support multi-cloud strategies with cloud-portable services of their own.

Other Adds to my Constellation ShortLists

Among the other vendors I’ve added to various Constellation ShortLists are the following:

DataRobot made my Constellation ShortList for Self-Service Advanced Analytics. DataRobot has combined intuitive user interfaces with extensive automation capabilities to bring advanced analytics and extensive data-visualization capabilities to data-savvy business users as well as data scientists.

Datawatch made my Constellation ShortList for Self-Service Data Prep. Datawatch has recast its venerable and extensive data-connection and data-transformation capabilities and has since seen steady, double-digit growth in demand for its Monarch and collaborative Monarch Swarm data-prep products.

Domo made my Constellation ShortList for Cloud-Based Business Intelligence and Analytics. Domo was added based on its fast growth, surpassing 1,000 customers in 2017, and its progress in serving high-scale deployments with thousands of users. The company has also pushed into predictive capabilities and added support for hybrid deployments with federated data access without bulk data movement.

Kylo and Unifi made my Constellation ShortList for Data Lake Management. Kylo, which is offered by Teradata’s Think Big Analytics unit, was added based on the combination of dozens of successful deployments within large, well-known enterprises and significant contributions from a growing open source community. Unifi was added for its breadth of capabilities extending into data cataloging and self-service data preparation.

The Constellation ShortLists are published twice per year, in January and July, and are freely accessible at ConstellationR.com.

Nvidia Accelerates Artificial Intelligence, Analytics with an Ecosystem Approach


Nvidia’s GTC 2018 event spotlights a play book that goes far beyond chips and servers. Get set for the next era of training, inferencing and accelerated analytics.

“We’re not a chip company; we’re a computing architecture and software company.”

This proclamation, from NVIDIA co-founder, president and CEO Jensen Huang at the GPU Technology Conference (GTC), March 26-29 in San Jose, CA, only hints at this company’s growing impact on state-of-the-art computing. Nvidia’s physical products are accelerators (for third-party hardware) and the company’s own GPU-powered workstations and servers. But it’s the company’s GPU-optimized software that’s laying the groundwork for emerging applications such as autonomous vehicles, robotics and AI while redefining the state of the art in high-performance computing, medical imaging, product design, oil and gas exploration, logistics, and security and intelligence applications.

Jensen Huang, co-founder, president and CEO, Nvidia, presents the sweep of the company’s growing AI Platform at GTC 2018 in San Jose, Calif.

On Hardware

On the hardware front, the headlines from GTC built on the foundation of Nvidia’s graphics processing unit advances.

  • The latest upgrade of Nvidia’s Tesla V100 GPU doubles memory to 32 gigabytes, improving its capacity for data-intensive applications such as training of deep-learning models.
  • A new NVSwitch interconnect fabric enables up to 16 Tesla V100 GPUs to share memory and simultaneously communicate at 2.4 terabytes per second — five times the bandwidth and performance of industry standard PCI switches, according to Huang. Coupled with the new, higher-memory V100 GPUs, the switch greatly scales up computational capacity for deep-learning models.
  • The DGX-2, a new flagship server announced at GTC, combines 16 of the latest V100 GPUs and the new NVSwitch to deliver two petaflops of computational power. Set for release in the third quarter, it’s a single server geared to data science and deep-learning that can replace 15 racks of conventional CPU-based servers at far lower initial cost and operational expense, according to Nvidia.

If the “feeds and speeds” stats mean nothing to you, let’s put them into the context of real workloads. SAP tested the new V100 GPUs with its SAP Leonardo Brand Impact application, which delivers analytics about the presence and exposure time of brand logos within media to help marketers calculate returns on their sponsorship investments. With the doubling of memory to 32 gigabytes per GPU, SAP was able to use higher-definition images and a larger deep-learning model than previously used. The result was higher accuracy, with a 40 percent reduction in the average error rate yet with faster, near-real-time performance.

In another example, based on a FAIRSeq neural machine translation model benchmark test, training that took 15 days on Nvidia’s six-month-old DGX-1 server took less than 1.5 days on the DGX-2. That’s a 10x improvement in performance and productivity that any data scientist can appreciate.

On Software

Nvidia’s software is what’s enabling workloads, particularly deep learning workloads, to migrate from CPUs to GPUs. On this front Nvidia unveiled TensorRT 4, the latest version of its deep-learning inferencing (a.k.a. scoring) software, which optimizes performance and, therefore, reduces the cost of operationalizing deep learning models in applications such as speech recognition, natural language processing, image recognition and recommender systems.

Here’s where the breadth of Nvidia’s impact on the AI ecosystem was apparent. Google, for one, has integrated TensorRT 4 into TensorFlow 1.7 to streamline development and make it easier to run deep-learning inferencing on GPUs. Huang’s keynote included a striking visual demo showing TensorFlow-based image recognition peaking at 300 images per second without TensorRT and then boosted to 2,600 images per second with TensorRT integrated into TensorFlow.
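
For developers curious what that integration looks like, here’s a minimal sketch of the TensorFlow 1.7-era contrib API for converting a frozen graph into a TensorRT-optimized one. The model file and output node name are hypothetical, and later TensorFlow releases replaced this module with newer TF-TRT converter APIs.

# Sketch of the TensorFlow 1.7-era TensorRT integration path; the frozen-graph
# file and output node name below are placeholders.
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

with tf.gfile.GFile("frozen_resnet50.pb", "rb") as f:  # hypothetical model file
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

# Replace supported subgraphs with TensorRT-optimized ops for faster inference.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=["softmax"],                  # hypothetical output node
    max_batch_size=8,
    max_workspace_size_bytes=1 << 30,
    precision_mode="FP16",
)

with tf.gfile.GFile("frozen_resnet50_trt.pb", "wb") as f:
    f.write(trt_graph.SerializeToString())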

Nvidia also announced that Kaldi, the popular speech recognition framework, has been optimized to run on its GPUs, and the company says it’s working with Amazon, Facebook and Microsoft to ensure that developers using ONNX-compatible frameworks, such as Caffe2, CNTK, MXNet and PyTorch, can easily deploy on Nvidia deep learning platforms.

In a show of support from the data science world, MathWorks announced TensorRT integration with its popular MATLAB software. This will enable data scientists using MATLAB to automatically generate high-performance inference engines optimized to run on Nvidia GPU platforms.

In an example of GPU-accelerated analytics, this MapD geospatial analysis shows six years of shipping traffic — 11.6 billion records without aggregation — along the West Coast.

On Cloud

The cloud is a frequent starting point for GPU experimentation, and it’s an increasingly popular deployment choice for spiky, come-and-go data science workloads. With this in mind, Nvidia announced support for Kubernetes to facilitate GPU-based inferencing in the cloud for hybrid bursting scenarios and multi-cloud deployments. Executives stressed that Nvidia is not trying to compete with a Kubernetes distribution of its own. Rather, it’s contributing enhancements to the open-source community, making crucial GPU-optimized Kubernetes modules available.

The ecosystem-support message was much the same around Nvidia GPU Cloud (NGC). Rather than offering competing cloud compute and storage services, NGC is a cloud registry and certification program that ensures that Nvidia GPU-optimized software is available on third-party clouds. At GTC Nvidia announced that NGC software is now available on AWS, Google Cloud Platform, Alibaba’s AliCloud, and Oracle Cloud. This adds to the support already offered by Microsoft Azure, Tencent, Baidu Cloud, Cray, Dell, Hewlett Packard, IBM and Lenovo. Long story short, companies can deploy Nvidia GPU capacity and optimized software on just about any cloud, be it public or private.

MyTake on GTC and Nvidia

I was blown away by the range and number of AI-related sessions, demos and applications in evidence at GTC. Yes, it’s an Nvidia event, and GPUs were the ever-present enabler behind the scenes. But the focus of GTC and of Nvidia is clearly on easing the path to development and operationalization of applications harnessing deep learning, high-performance computing, accelerated analytics, virtual and augmented reality, and state-of-the-art rendering, imaging and geospatial analysis.

Analyst discussions with Huang, Bill Dally, Nvidia’s chief scientist and SVP of Research, and Bob Pette, VP and GM of pro visualization, underscored that Nvidia has spent the last half of its 25-year history building out its depth and breadth across industries ranging from manufacturing, automotive, and oil and gas exploration to healthcare, telecom, and architecture, engineering and construction. Indeed, Nvidia Research placed its bets on AI – which will have a dramatic impact across all industries – back in 2010. That planted the seeds, as Dally put it, for the depth and breadth of deep learning framework support that the company has in place today.

Nvidia can’t be a market maker entirely on its own. My discussions at GTC with accelerated analytics vendors Kinetica, MapD, Fast Data and BlazingDB, for example, revealed that they’re moving beyond a technology-focused sell on the benefits of GPU query, visualization and geospatial analysis performance. They’re moving to a vertical-industry, applications and solutions sell catering to oil and gas, logistics, financial services, telcos, retail and other industries. That’s a sign of maturation and mainstream readiness for GPU-based computing. In one of my latest research reports, “Danske Bank Fights Fraud with Machine Learning and AI,” you can read about why a 147-year-old bank invested in Nvidia GPU clusters on the strength of convincing proof-of-concept tests around deep-learning-based fraud detection.

Of course, there’s still work to do to broaden the GPU ecosystem. At GTC Nvidia announced a partnership through which its open-source deep learning accelerator architecture will be integrated into mobile chip maker Arm’s Project Trillium platform. The collaboration will make it easier for internet-of-things chip companies to integrate AI into their designs and deliver the billions of smart, connected consumer devices envisioned in our future. It was one more sign to me that Nvidia has a firm grasp on where its technology is needed and how to lay the groundwork for next-generation applications powered by GPUs.

Related Reading:
Danske Bank Fights Fraud with Machine Learning and AI
How Machine Learning & Artificial Intelligence Will Change BI & Analytics
Amazon Web Services Adds Yet More Data and ML Services, But When is Enough Enough?

 

Cloudera Transitions, Doubles Down on Data Science, Analytics and Cloud


Cloudera has restructured amid intensifying cloud competition. Here’s what customers can expect.

Cloudera’s plan is to lead in machine learning, to disrupt in analytics and to capitalize on customer plans to move into the cloud.

It’s a solid plan, for reasons I’ll explain, but that didn’t prevent investors from punishing the company on April 3 when it offered weaker-than-expected guidance for its next quarter. Despite the company reporting 50 percent growth for the fiscal year ended January 31, 2018, its stock price subsequently plunged 40 percent.

Cloudera’s narrative, shared at its April 9-10 analyst and influencers conference, is that it has restructured to elevate customer conversations from tech talk with the CIO to a C-suite and line-of-business sell about digital transformation. That shift, they say, could bring slower growth (albeit still double-digit) in the short term, but executives say it’s a critical transition for the long term. Investors seem spooked by the prospect of intensifying cloud competition, but here’s why Cloudera expects to keep and win enterprise-grade customers.


It Starts With the Platform

Cloudera defines itself as an enterprise platform company, and it knows enterprise customers want hybrid and multi-cloud options. Cloudera’s options now range from on-premises on bare metal to private cloud to public cloud on infrastructure as a service to, most recently, Cloudera Altus public cloud services, available on Amazon Web Services (AWS) and Microsoft Azure.

Supporting all these deployment modes is, of course, something that AWS and Google Cloud Platform (GCP) don’t do and that Microsoft, IBM, and Oracle do exclusively in their own clouds. The key differentiator that Cloudera is counting on is its Shared Data Experience. SDX gives customers the ability to define and share data access and security, data governance, data lifecycle management and deployment management and performance controls across any and all deployment modes. It’s the key to efficiently supporting both hybrid and multi-cloud deployments. Underpinning SDX is a shared data/metadata catalog that spans deployment modes and both cloud- and on-premises storage options, whether they are Cloudera HDFS or Kudu clusters or AWS S3 or Azure Data Lake object stores.

As compelling as public cloud services such as AWS Elastic MapReduce may sound from the standpoint of simplicity, elasticity and cost, Cloudera says enterprise customers are sophisticated enough to know that harnessing their data is never as simple as using a single cloud service. In fact, the variety of services, storage and compute variations that have to be spun up, connected and orchestrated can get quite extensive. And when all those per-hour meters are running, the collection of services can also get surprisingly expensive. When workloads are sizeable, steady and predictable, many enterprises have learned that it can be much more cost-effective to handle them on-premises. If they like cloud flexibility, perhaps they’ll opt for a virtualized private-cloud approach rather than going back to bare metal.

With more sophisticated and cost-savvy customers in mind, Cloudera trusts that SDX will appeal on at least four counts:

  • Define once, deploy many: IT can define data access and security, data governance, data lifecycle, and performance management and service-level regimes and policies once and apply them across deployment models. All workloads share the same data under management, without having to move data or create copies and silos for separate use cases.
  • Abstract and simplify: Users get self-service access to resources without having to know anything about the underlying complexities of data access, deployment, lifecycle management and so on. Policies and controls enforce who sees what, which workloads run where and how resources are managed and assigned to balance freedom and service-level guarantees.
  • Provide elasticity with choice: With its range of deployment options, SDX gives enterprises more choice and flexibility than a cloud-only provider in terms of how it meets security, performance, governance, scalability and cost requirements.
  • Avoid lock-in: Even if the direction is solidly public cloud, SDX gives enterprises options to move workloads between public clouds and to negotiate better deals knowing they won’t have to rebuild their applications if and when they switch providers.

MyPOV on SDX

The Shared Data Experience is compelling, though at present it’s three parts reality and one part vision. The shared catalog is Hive and Hadoop centric, so Cloudera is exploring ways to extend the scope of the catalog and the data hub. Altus services are generally available for data engineering, but only recently entered beta (on AWS) for analytics deployments and persisting and managing SDX in the cloud. General availability of Cloudera Analytics and SDX services on Azure is expected later this year. Altus Data Science is on the roadmap, as are productized ways to deploy Altus services in private clouds. For now, private cloud deployments are entirely on customers to manage. In short, the all-options-covered rhetoric is a bit ahead of reality, but the direction is clear.

Machine Learning, Analytics and Cloud

Cloudera is counting on these three growth areas, so much so that it last year appointed general managers of each domain and reorganized with dedicated product development, product management, sales and profit-and-loss responsibility. At Cloudera’s analyst and influencers conference, attendees heard presentations by each of the new GMs: Fast Forward Labs founder Hilary Mason on ML, Xplain.io co-founder Anupam Singh on analytics, and Oracle and VMware veteran Vikram Makhija on Cloud.

Lead in Machine Learning. The machine learning strategy is to help customers develop and own their ability to harness ML, deep learning and advanced analytical methods. They are “teaching customers how to fish” using all of their data, algorithms of their choice and running workloads in the deployment mode of their choice. (This is exactly the kind of support executives wanted at a global bank based in Denmark, as you can read in my recent “Danske Bank Fights Fraud with Machine Learning and AI” case study report.)

Cloudera last year acquired Mason’s research and consulting firm Fast Forward Labs with an eye toward helping customers to overcome uncertainty on where and how to apply ML methods. The Fast Forward team offers applied research (meaning practical, rather than academic), strategic advice and feasibility studies designed to help enterprises figure out whether they’re pursuing the right problems, setting realistic goals, and gathering the right data.

On the technology side, Cloudera’s ML strategy rests on the combination of SDX and the Cloudera Data Science Workbench (CDSW). SDX addresses the IT concerns from a deployment, security and governance perspective while CDSW helps data scientists access data and manage workloads in self-service fashion, coding in R, Python or Scala and using analytical, ML and DL libraries of their choice.
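
To give a flavor of that self-service workflow, here’s a minimal, hypothetical PySpark sketch of the kind of code a data scientist might run in a CDSW session. The database and table names are invented, and it assumes the session can see the cluster’s shared Hive metastore that SDX governs.

from pyspark.sql import SparkSession

# Sketch of the kind of self-service session code a data scientist might run;
# the table name is hypothetical and assumes access to the shared Hive metastore.
spark = SparkSession.builder.appName("churn-exploration").enableHiveSupport().getOrCreate()

# Query governed, shared data in place -- no extracts or copies required.
usage = spark.sql("""
    SELECT customer_id, SUM(data_mb) AS total_mb, COUNT(*) AS sessions
    FROM telemetry.daily_usage
    GROUP BY customer_id
""")

# Pull a manageable sample to the driver for local feature exploration in pandas.
sample_pdf = usage.sample(False, 0.01, 42).toPandas()
print(sample_pdf.describe())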


MyPOV on Cloudera ML. Here, too, it’s a solid vision with pieces and parts that have yet to be delivered. As mentioned earlier, Altus Data Science is on the roadmap (not even in beta), as are private-cloud and Kubernetes support. Also on the roadmap are model-management and automation capabilities that enterprises need at every stage of the model development and deployment lifecycle as they scale up their modeling work. Here’s where Azure Machine Learning and AWS SageMaker, to name two, are steps ahead of the game.

I do like that Cloudera opens the door to any framework and draws the line at data scientist coding with CDSW, leaving visual, analyst-level data science work to best-of-breed partners such as Dataiku, DataRobot, H2O and RapidMiner.

Disrupt in Analytics. It was eye-opening to learn that Cloudera gets the lion’s share of its revenue from analytics — more than $100 million out of the company’s fiscal year 2018 total of $367 million in revenue. One might think of Cloudera as being mostly about big, unstructured data. In fact, it’s heavily about disrupting the data warehousing status quo and enabling new, SQL-centric applications with the combination of the Impala query engine, the Kudu table store (for streaming and low-latency applications), and Hive on Apache Spark.

Cloudera analytics execs say they’re having a field day optimizing data warehouses and consolidating dedicated data marts (on Netezza and other aging platforms) now seen as expensive silos requiring redundant infrastructure and copies of data. With management, security, governance and access controls and policies established once in SDX, Cloudera says IT can support myriad analytical applications without moving or copying data. That data might span AWS S3 buckets, Azure Data Lakes, HDFS, Kudu or all of the above.

The new news in analytics is that Cloudera is pushing to give DBA types all the performance-tuning and cost-based analysis options they’re used to having in data warehousing environments. Cloudera already offered its Analytic Workbench (also known as HUE) for SQL query editing. What’s coming, by midyear, is a consolidated performance analysis and recommendation environment. Code-named Workload 360 for now, this suite will provide end-to-end guidance on migrating, optimizing and scaling workloads. To be delivered as a cloud service, it combines Navigator Optimizer (tools acquired with Xplain.io) with the workload analytics capabilities introduced with Altus. Think of it as a brain for data warehousing that will help companies streamline migrations, meet SLAs, fix lagging queries and proactively avoid application failures.

MyPOV on Analytics. Workload management tools are a must for heavy duty data warehousing environments, so this analysis-for-performance push is a good thing. Given the recent push into autonomous database management, notably by Oracle, I would have liked to have heard more about plans for workload automation.

Cloudera also didn’t have much to say about the role of Hive and Spark for analytical and streaming workloads, but I suspect they are significant. I’ve also talked to Cloudera customers (read “Ultra Mobile Takes an Affordable Approach to Agile Analytics”) that tap excess relational database capacity to support low-latency querying rather than relying on Impala, Hive or a separate Kudu cluster. Hive, Spark and conventional database services or capacity fall into the category of practical, cost-conscious options that may not drive additional Cloudera analytics revenue, but it’s an open platform that gives customers plenty of options.


Capitalize on the Cloud. As noted above, SDX and the growing Altus portfolio are at the heart of Cloudera’s cloud plans. Enough said about the pieces still to come or missing. I see SDX as compelling, and it’s already helping customers to efficiently run myriad data engineering and analytic workloads in hybrid scenarios. But as a practical matter, many companies aren’t that sophisticated and are choosing to keep things simple with binary choices: X data and use case on-premises and Y data and use case in the cloud. Indeed, one of Cloudera’s customer panel guests acknowledged the importance of avoiding cloud lock-in; nonetheless, he said his firm is considering the “simplicity” versus data/application portability tradeoffs of using Google Cloud Platform-native services.

MyPOV on Cloudera Cloud. Binary thinking is not the way to harness the power of using all your data, and it can lead to overlaps, redundancies and the need to move and copy data. Nonetheless, handling X on premises and Y in the cloud may be seen as the simpler and more obvious way to go, particularly if there are natural application, security or organizational boundaries. Cloudera has to execute on its cloud vision, develop a robust automation strategy and demonstrate to enterprises, with plenty of customer examples, that the SDX way is a simpler, more cost-effective and more innovation-friendly path than binary thinking.

Related Reading:
Nvidia Accelerates AI, Analytics with an Ecosystem Approach
Danske Bank Fights Fraud With Machine Learning and AI
Ultra Mobile Takes an Affordable Approach to Agile Analytics

Oracle Modern CX Spotlights Customer Data Platform, AI Accelerator


Oracle previews CX Unity, a ‘next-gen’ sales UI, and DataFox upgrades. The investments are promising, but don’t dismiss data lakes so easily.

“You have to empower whoever gets to the customer first.” This is the key message that Rob Tarkoff, executive VP and general manager of Oracle CX Cloud, shared at Oracle Modern Customer Experience (CX), March 19-21, in Las Vegas.

Tarkoff’s point was that customers rarely follow prescribed, linear journeys, so everybody must be armed with customer insight and context whether they’re in marketing, sales, commerce or service.

To gain contextual insight you need data, and that’s why Oracle announced Oracle CX Unity last October at Oracle Open World. Billed as a “customer intelligence platform,” CX Unity is designed to provide “a comprehensive view into customer interactions across channels and applications.”

Oracle CX Unity is the company’s coming customer data platform.

CX Unity is Oracle’s entry into the emerging customer data platform (CDP) market, a category first defined in 2013 that gave rise to the Customer Data Platform Institute (CDPI) in 2016. Search through CDPI’s directory and you won’t find any big tech vendors — just pioneers and startups, mostly emerging from the marketing technology space. That’s about to change, with both Oracle and, on March 25, Salesforce announcing their intention to offer CDPs. And it’s a safe bet that Adobe won’t be far behind, having announced a joint initiative last year with Microsoft and SAP to work together on an open customer data standard.

As defined by CDPI, a CDP is “packaged software that creates a persistent, unified customer database that is accessible to other systems.” The platform benefits marketers because it “allows a faster, more efficient solution than general purpose technologies that try to solve many problems at once.” The need for CDPs emerged as marketers started pursuing mass personalization at scale, and it’s only getting more intense as customers expect you to know them and their most recent dealings with your company in real time.
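
To make the “persistent, unified customer database” idea concrete, here’s a toy sketch that merges interaction events from several channels under a resolved identity. It’s purely illustrative and not how CX Unity or any particular CDP is implemented.

from collections import defaultdict

# Toy illustration of the "unified customer profile" idea behind a CDP; the
# events and identities below are made up.
interactions = [
    {"channel": "web",     "identity": "c-1001", "event": "viewed_product", "sku": "A12"},
    {"channel": "email",   "identity": "c-1001", "event": "clicked_offer",  "offer": "spring"},
    {"channel": "service", "identity": "c-1001", "event": "opened_ticket",  "ticket": 7731},
    {"channel": "web",     "identity": "c-2002", "event": "abandoned_cart", "sku": "B77"},
]

profiles = defaultdict(lambda: {"events": [], "channels": set()})
for event in interactions:
    profile = profiles[event["identity"]]   # key on the resolved identity
    profile["events"].append(event)
    profile["channels"].add(event["channel"])

# Any downstream system (marketing, sales, service) can now read one
# cross-channel view of the customer.
for identity, profile in profiles.items():
    print(identity, sorted(profile["channels"]), len(profile["events"]), "events")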

Attendees heard lots of promises about the advantages that CX Unity will bring. Oracle has announced availability for CX Unity by late summer. It’s only natural for Oracle to deliver a CDP, but unlike marketing-only vendors, it has a duty to reconcile this offering with the company’s MDM and data lake offerings (as I discuss below).

As for the “next-generation” sales experience demonstrated at Modern CX, the description made it clear that it was an early vision statement — so not something I would expect to see in 2019. The demo combined automated voice-to-text call transcription/logging and contextual, AI-based recommendation capabilities. I’ve seen similar capabilities both from Oracle partners, like the OppSource Sales Engagement app, which is integrated with Oracle Sales Cloud, and from Oracle rivals (in pilot release).

The third big highlight from Modern CX for me was DataFox, which was acquired last October and mentioned by nearly every executive who took the stage at the event. Think of Oracle DataFox as an AI accelerator that will bolster the smart-recommendation capabilities of Oracle’s CX and ERP clouds, mostly through Oracle Adaptive Intelligent Apps.

On the sales front, DataFox studies the companies you sell to successfully in a business-to-business context using machine learning and then finds lookalikes. It also prioritizes leads and next steps and it recommends talking points to help salespeople accelerate their selling. DataFox look-alike capabilities will be used in the ERP context to avoid supplier risk and to spot best-fit alternative suppliers.
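
As a rough illustration of the lookalike concept (not DataFox’s actual method), here’s a toy sketch that finds the prospects closest to already-won customers using nearest neighbors on made-up firmographic features.

import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Toy lookalike sketch -- not DataFox's actual method. Rows are companies,
# columns are invented firmographic features (employees, revenue $M, growth %).
companies = ["WonCo", "WonCo2", "ProspectA", "ProspectB", "ProspectC"]
features = np.array([
    [1200, 300, 12],    # closed-won customer
    [1500, 340, 15],    # closed-won customer
    [1300, 320, 14],    # prospect that resembles the wins
    [40,   5,   80],    # small, fast-growing prospect
    [9000, 2500, 2],    # large, slow-growing prospect
])

scaled = StandardScaler().fit_transform(features)
won_idx = [0, 1]   # indices of companies already sold to successfully

# Find the prospects closest to the "won" profile in feature space.
nn = NearestNeighbors(n_neighbors=3).fit(scaled)
distances, neighbors = nn.kneighbors(scaled[won_idx])

for i, row in zip(won_idx, neighbors):
    lookalikes = [companies[j] for j in row if j not in won_idx]
    print(f"Lookalikes for {companies[i]}: {lookalikes}")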

MyPOV on Oracle Modern CX Announcements

DataFox was clearly a good acquisition for Oracle, and I particularly like the “human-in-the-loop” (Mechanical-Turk-like) training and exception-handling capabilities. The next-gen sales demo struck me as a “we’re working on it” trial balloon, but if it makes it to market in 2019, Oracle will indeed be among the fast followers on AI-augmented sales.

At the center of Modern CX was CX Unity, which executives described as “three years in the making.” I’m hoping for conversations with early beta customers by Oracle Open World in September.


I have to admit that it will be more challenging for a vendor like Oracle to jump into the CDP market. Unlike marketing-only vendors, Oracle will be held to a higher standard to reconcile its CDP with its master data management (MDM), data warehouse and data lake offerings. I asked one Oracle exec how CDPs and customer MDM will coexist, and he described them as complementary, which they should be. But he went on to largely dismiss MDM as relevant only in a legal-liability context, resolving identities for ERP transaction purposes. In my view that underplays the value of MDM; you’ll want your steward-resolved customer MDM identities synchronized with your CDP, as sketched below.
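As a minimal sketch of that last point, assuming a very simplified CDP profile store and MDM merge event (neither reflects an actual Oracle interface), synchronization might look like propagating steward-approved merges into the CDP so behavioral history rolls up under the surviving golden-record identity.

```python
# Hypothetical sketch of keeping CDP profiles aligned with MDM golden records.
# Nothing here corresponds to an actual Oracle product interface.

def apply_mdm_merge(cdp_profiles: dict, survivor_id: str, merged_ids: list) -> dict:
    """When data stewards merge duplicate customer records in MDM,
    fold the corresponding CDP profiles into the surviving identity."""
    survivor = cdp_profiles.setdefault(survivor_id, {"events": [], "attributes": {}})
    for old_id in merged_ids:
        duplicate = cdp_profiles.pop(old_id, None)
        if duplicate:
            survivor["events"].extend(duplicate["events"])
            # Keep the survivor's existing attributes; backfill any the duplicate carried.
            survivor["attributes"] = {**duplicate["attributes"], **survivor["attributes"]}
    return cdp_profiles

profiles = {
    "MDM-100": {"events": [{"channel": "web"}], "attributes": {"tier": "gold"}},
    "MDM-205": {"events": [{"channel": "email"}], "attributes": {"region": "EMEA"}},
}
print(apply_mdm_merge(profiles, survivor_id="MDM-100", merged_ids=["MDM-205"]))
```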

I also wasn’t entirely comfortable with how easily Oracle’s CX executives picked on data lakes as a poor alternative to what CX Unity will offer. It’s all well and good for marketing types to try to put their arms around “customer” data, but as CDPs start to grow, I think larger enterprises and others with incumbent data warehouses and data lakes will start to see overlaps and redundancies.

What is and what isn’t customer-related behavioral data, and who decides which data to store and where? I think customers will expect more guidance (and, perhaps, integrations) from companies like Oracle than they would from marketing-focused startups to help them rationalize investments in data warehouses, data lakes and the emerging customer data platform.

Related Reading:
Salesforce Dreamforce 2018 Spotlights Identity, Integration, AI and Getting More For Less
Microsoft Steps Up Data Platform and AI Ambitions
Cloudera Transitions, Doubles Down on Data Science, Analytics and Cloud

C3.ai Highlights AI Platform Adoption, Scores IBM Partnership


C3.ai announces IBM systems integration alliance as customers Shell, 3M, Bank of America and others detail AI and IoT progress.    

There are plenty of tools and point solutions that address bits and pieces of the challenge of delivering artificial intelligence (AI) and Internet of things (IoT) applications. C3.ai’s focus is on delivering an end-to-end platform for developing, deploying and running these applications in production at scale.

Whether customers use every aspect of the C3.ai platform or not, big enterprise-scale companies seem to be attracted by that promise of quickly developing and running innovative, data-driven applications at scale. There was plenty of evidence of that fact at C3.ai’s February 25-27 Transform conference in San Francisco, where customers including Bank of America, Shell, 3M and Engie detailed their deployments.

C3.ai’s cloud-first platform is comprehensive, addressing the needs of developers, data engineers and data scientists, and the operational teams challenged with bringing applications into production at scale. Most of the components are based on open source software, such as PostgreSQL, Cassandra, Kafka and Hadoop. Yet the platform is also designed to be modular and open, so customers can swap in preferred tools and components, whether those might be integrated development environments, data platforms, AI and machine learning (ML) frameworks and tools, or DevOps components (see integrations slide, below).

C3 AI Suite
The C3.ai platform is comprehensive and built on open-source software, yet it’s also modular, so customers can use favored components. (Source: C3.ai)
C3ai Connectors Delivered in 2019
C3.ai offers connectors and integrations to these and other popular integrated development environments, frameworks, suites, tools and DevOps options. (Source: C3.ai)

The core, differentiating aspect of the C3.ai platform is its “type system” architecture, which ensures development speed, repeatability and scalability. The type system uses metadata to represent everything in the end-to-end development-to-deployment process: all data and data sources, underlying storage technologies, data science models, data-processing services, applications and application services. Everything is represented in a consistent, abstracted way that hides the complexity of behind-the-scenes technologies and that simplifies application development, deployment and ongoing optimization.
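C3.ai hasn’t published the internals of its type system, so the following is only a conceptual sketch, under my own assumptions, of what a metadata-driven abstraction looks like: each data source, model or service is registered as a typed object with a schema and a binding, and application code works against the type name rather than the underlying technology.

```python
# Conceptual sketch of a metadata "type system" abstraction -- not C3.ai's actual implementation.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class TypeDef:
    """Metadata describing one platform entity: a data source, a model or a service."""
    name: str
    kind: str                 # e.g. "datasource", "model", "service"
    schema: Dict[str, str]    # field name -> logical type
    binding: Callable         # how to actually fetch/compute -- hidden from app code

REGISTRY: Dict[str, TypeDef] = {}

def register(typedef: TypeDef) -> None:
    REGISTRY[typedef.name] = typedef

def fetch(type_name: str, **params):
    """Application code asks for a type by name; the registry resolves the backing technology."""
    return REGISTRY[type_name].binding(**params)

# The same application call works whether the binding hits Cassandra, PostgreSQL or a REST API.
register(TypeDef(
    name="CompressorSensorReading",
    kind="datasource",
    schema={"asset_id": "string", "timestamp": "datetime", "vibration": "double"},
    binding=lambda asset_id: [{"asset_id": asset_id, "vibration": 0.42}],  # stub backend
))

print(fetch("CompressorSensorReading", asset_id="C-17"))
```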

C3.ai offers its platform as a service running on Microsoft Azure, but it can also run on private clouds or other public clouds. It’s also possible to straddle hybrid and multi-cloud deployment, with clustered, multitenant instances running on-premises and on one or more public clouds.

C3.ai’s combination of agility and scalability, along with flexibility and openness to third-party tools, seems to be gaining traction. At Transform 2020, company founder and CEO Tom Siebel (and also the founder and longtime CEO of Siebel Systems) reported that C3.ai is on track for a 143% increase in bookings to $447 million in fiscal year 2020, up from $184 million in FY 2019. That’s a respectable scale for an 11-year-old company with just under 500 employees. Yet, to realize the company’s ambitions, Siebel acknowledged that C3.ai will need the help of larger partners.

Tom Siebel
C3.ai founder and CEO Tom Siebel keynoting at the company’s February 25-27 Transform event in San Francisco. (Source: Doug Henschen)

To that end, Siebel announced C3.ai’s latest partnership at Transform: a strategic alliance under which IBM Services becomes the company’s first preferred global systems integrator. The two companies said the deal will “fast-track the delivery of enterprise-scale industry and domain specific AI applications.”

The IBM deal follows on the heels of C3.ai’s 2019 joint venture partnership with Baker Hughes to jointly deliver C3.ai’s AI platform and applications, along with combined industry and technology expertise, to the oil and gas industry. And in 2018, C3.ai announced a strategic partnership with Microsoft to use Azure as its preferred public cloud platform (although it also announced separate partnerships that year with Amazon Web Services and Google Cloud Platform).

Daniel Jeavons - Shell
Daniel Jeavons, Shell’s general manager of data science, details the company’s data-driven predictive maintenance and optimization apps built on C3.ai. (Source: Doug Henschen)

What most impressed me at C3.ai’s event was the number and depth of customer presentations detailing extensive deployments and ambitious deployment plans:

  • Shell executive Daniel Jeavons, general manager of data science, updated attendees on a C3.ai deployment, now two years old, that started with predictive maintenance IoT applications and now predicts and prevents unplanned failures of thousands of valves and compressors. These monitored assets are in use in onshore and offshore oil and gas production facilities in more than 20 locations around the globe. Avoidance of outages at just one of these locations is credited, thus far, with saving $2 million. Next up, Shell is using the same data foundation and reusable components to develop a power-optimization application for electric submersible pumps (see the illustrative sketch below).
  • 3M executive Jennifer Austin, manufacturing & supply chain analytics implementation leader, detailed progress on data-science-driven inventory optimization, pricing analytics and supply chain risk-management applications developed and delivered over the last two years. Austin acknowledged that 3M is still working through change-management and user-adoption challenges in certain areas, but the inventory optimization application alone is expected to generate as much as $200 million in annual savings while improving service levels.
  • Bank of America executive Brice Rosenzweig, co-head of the bank’s Data and Innovation Group, detailed progress on building out a more business-user-accessible data platform and early (not-yet-in-production) progress on cash management and lending optimization applications built on the C3.ai platform. A customer since last fall, Bank of America is now considering at least a dozen additional use cases that would leverage the C3.ai platform, Rosenzweig said.
Brice Rosenzweig - Bank of America
Bank of America executive Brice Rosenzweig discusses progress on cash management and lending optimization applications in development. (Source: Doug Henschen)
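Shell’s applications run on the C3.ai platform itself; the snippet below is merely a generic sketch, with invented data and thresholds, of the kind of failure-risk scoring a predictive maintenance application applies to sensor streams from valves and compressors.

```python
# Generic predictive-maintenance scoring sketch -- not Shell's or C3.ai's actual application.
from statistics import mean, stdev

def failure_risk(vibration_history: list, window: int = 20, z_threshold: float = 3.0) -> dict:
    """Flag an asset when the latest vibration reading deviates sharply from its recent baseline."""
    baseline, latest = vibration_history[-window:-1], vibration_history[-1]
    mu, sigma = mean(baseline), stdev(baseline)
    z = (latest - mu) / sigma if sigma else 0.0
    return {"z_score": round(z, 2), "alert": abs(z) > z_threshold}

# Example: a compressor trending normally, then a spike that would trigger a work order.
readings = [0.40 + 0.01 * (i % 3) for i in range(30)] + [0.95]
print(failure_risk(readings))
```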

My Take on C3.ai’s Progress

C3.ai detailed its plans to push further into the banking and financial services sector, addressing use cases including anti-money laundering, customer churn management, cash management, securities lending and more. But executives stressed that C3.ai will emphasize “accelerants” of development and delivery rather than building out every possible application. It’s a sound approach, as financial services firms resist one-size-fits-all solutions for what might be differentiating capabilities. At the same time, these accelerants, including reusable data objects and analytic objects, will all be built on the C3.ai platform. Thus, customers get the advantages of rapid development and deployment, but their customized applications won’t break when C3.ai updates its platform.

A key strength for C3.ai, and one that every customer I talked to seems to be exploiting, is the openness to swap in preferred tools for specific tasks while relying on the platform and its type system architecture for abstracting complexity and running data-driven applications at scale. Bank of America, for example, uses DataRobot for data science exploration, while Shell uses a combination of Alteryx, Databricks and TIBCO Spotfire for rapid proof-of-concept work before bringing models and applications into production at scale on the C3.ai platform. C3.ai’s platform and ambitions are big and broad, so I see the company’s pragmatic openness to best-of-breed options and cloud-native services as sensible, realistic and in tune with the openness and flexibility customers are after.

C3.ai executives repeatedly emphasized that its roadmap is very customer driven. In 2019 the focus was on adding customer-requested connectors and integrations, low-code development options and user-interface improvements. A Spring 2020 upgrade will deliver improvements to the data-exploration and hyper-parameter optimization features of C3.ai’s Ex Machina data science studio. The company’s Summer 2020 release will focus on the developer experience, with improvements supporting serverless computing, self-healing capabilities and sub-second deployment speeds.

With its deals with IBM, Microsoft and Baker Hughes, C3.ai is clearly scaling up its sales and delivery capacity, yet it remains a midsized company that obviously takes pride in staying close to its customers. “There’s no customer where we have a traditional arm’s-length [vendor-to-customer] relationship,” Siebel concluded in his keynote address. “All of our customers are on speed dial.”

Related Reading:

C3.ai Speeds Digital Transformation Driven by AI and IoT Applications
Salesforce Dreamforce 2018 Spotlights Identity, Integration, AI and Getting More For Less
Climate Corp. Scales Up Data Science to Power Precision Agriculture

