CFD8 - Igneous Followup
Igneous had a lot to prove at Cloud Field Day 8. They had to show the delegates where they fit in the larger IT landscape now that they aren’t a traditional storage vendor. Participation in Cloud Field Day meant they saw themselves as a player in the cloud arena. Cloud can still be a hard term to define, especially for traditional enterprise customers, and vendors have been known to bend definitions to suit their needs. Igneous seems to fit somewhere in between traditional on-premises and cloud: it enables enterprises to back up on-premises file data to the cloud service provider (CSP) of their choice.
Igneous has integrations with all the major methods of storing file data. The website and marketing material make a point of calling this data unstructured. Unstructured can be a bit of a loaded term. The definition used by datamation.com makes a lot of sense here. Datamation.com frames the definition based on the application that accesses the data rather than the data itself. Structured data is accessed by databases, and unstructured data is accessed by every other type of application. Others tend to view structured data as data in markup formats that use data labels, like JSON-LD, while standard JSON is viewed as semi-structured. It is the former definition that seems to fit Igneous best.
Igneous’s DataProtect product connects over standard file protocols like SMB and NFS to traditional enterprise storage systems. Igneous hooks into the major vendors’ systems like NetApp, Isilon (now called Dell EMC PowerScale) and even Qumulo to enable backing up that data to the big three CSPs (Amazon AWS, Microsoft Azure, and Google GCP). Because enterprise file storage systems are sized to hold anywhere from multiple terabytes to petabytes, managing data at that scale has many implications.
Storage continues to evolve. There’s more density per rack unit (RU) than ever before, which means more TBs fit within the same amount of rack space than in previous years. Better density per RU can contribute to cost savings. The amount of data the world produces, however, continues to grow at an enormous rate. Igneous quotes IDC research that predicts data will grow from 33 zettabytes in 2018 to 175 zettabytes by 2025. That growth is spurred on by new applications and usage patterns which collect data at the edge, like the Internet of Things (IoT). Long-term storage of file data at the edge or in the datacenter can be costly. Moving that data to a cloud option like Azure Blob or AWS S3 can save costs beyond what is possible in the local datacenter or at the edge. Moreover, cloud providers offer additional tiers of deep storage that can save even more money.
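To put the IDC figures in perspective, the implied yearly growth can be worked out with a quick back-of-the-envelope calculation. The sketch below assumes steady compound growth between the two quoted data points, which is a simplification; the function name is just for illustration.

```python
def compound_annual_growth_rate(start, end, years):
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


# IDC figures quoted by Igneous: 33 ZB in 2018, 175 ZB predicted for 2025.
rate = compound_annual_growth_rate(start=33, end=175, years=2025 - 2018)
print(f"Implied growth: {rate:.1%} per year")
```

Under that assumption the numbers work out to roughly 27% growth per year, compounding for seven years in a row.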
Moving data from the edge to the cloud has additional benefits. Organizations often leverage the lower cost of S3 storage combined with scalable compute to host applications that are typically difficult to implement on-prem without a large cap-ex investment. High performance computing workloads come to mind. Data visualization and analysis with software like Grafana is another potential use. While it sounds simple, moving anywhere from multiple TBs to petabytes of data is not trivial. Transfer cost is rarely the concern, since cloud vendors typically charge for data egress (leaving the cloud) rather than ingress (entering the cloud). The challenge is getting the data there in a reasonable amount of time, and there are a few alternatives. AWS provides a line of devices ranging from the Snowcone, a small portable storage device that can be shipped to AWS, all the way up to the Snowmobile, which is literally a truckload of storage (100PB) that can be driven to an Amazon AWS datacenter. Igneous is a software option for backing up to the cloud within a reasonable SLA.
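A rough estimate shows why moving this much data over the wire is not trivial. The sketch below is only arithmetic: the link speeds and the 80% efficiency factor are assumptions picked for illustration, and real transfers also depend on file counts, latency, and how busy the source storage is.

```python
PETABYTE = 10**15  # decimal petabyte, in bytes


def transfer_days(size_bytes, link_gbps, efficiency=0.8):
    """Estimate days needed to push `size_bytes` over a link of `link_gbps`.

    `efficiency` is an assumed fraction of the link that is actually usable.
    """
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    seconds = size_bytes * 8 / usable_bits_per_second
    return seconds / 86_400  # seconds in a day


for gbps in (1, 10, 100):
    days = transfer_days(PETABYTE, gbps)
    print(f"1 PB over {gbps:>3} Gbit/s: about {days:.1f} days")
```

With these assumptions a single petabyte ties up a 1 Gbit/s link for well over three months, which is why shipping physical devices is still a serious option at this scale.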
Although backup isn’t a particularly sexy topic, it is a necessity for the majority of enterprise datasets. The DataProtect product is available as a virtual machine. There are a number of expected features. Based on the demo, Igneous hits all the major notes:
Full and Incremental Backups
Backups based around a tiered system
Leveraging compression techniques
The indexing scheme appears to be fast and fluid at returning files that meet the metadata search criteria. Those criteria include filenames, file extensions, directory names, and other basic information. Personally, I would also like to be able to search within file contents, for example - return all files that contain the word “President”. It would also be neat to add custom metadata. Deep content searching combined with customizable metadata tagging would enable flagging all files that contain PII - social security numbers, addresses, credit card numbers, etc. Extensible tagging would also make it possible to assign ownership to a department for showback or chargeback purposes. Understandably, there are considerable technical hurdles around these ideas, and they would dramatically impact speed given the massive amounts of data associated with Igneous’s target customers.
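To make the wish list a bit more concrete, here is a toy sketch of metadata search plus content-based tagging. It is purely illustrative and says nothing about how Igneous's index actually works: `FileRecord`, `search`, and `tag_if_pii` are hypothetical names, and the pattern only catches US social security numbers written as 123-45-6789.

```python
import re
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Simplistic, assumed pattern: SSNs formatted as 123-45-6789 only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class FileRecord:
    """One index entry: basic metadata plus user-defined tags."""
    path: str
    size_bytes: int
    tags: set = field(default_factory=set)


def search(records, extension=None, name_contains=None, tag=None):
    """Return records matching every criterion that was supplied."""
    results = []
    for record in records:
        path = PurePosixPath(record.path)
        if extension and path.suffix.lower() != extension.lower():
            continue
        if name_contains and name_contains.lower() not in path.name.lower():
            continue
        if tag and tag not in record.tags:
            continue
        results.append(record)
    return results


def tag_if_pii(record, text):
    """Add a 'pii' tag when the file's text contains an SSN-like string."""
    if SSN_PATTERN.search(text):
        record.tags.add("pii")


records = [
    FileRecord("/projects/hr/payroll.csv", 2048),
    FileRecord("/projects/eng/readme.txt", 512),
]
tag_if_pii(records[0], "employee,ssn\nJane Doe,123-45-6789")
records[0].tags.add("dept:hr")  # ownership tag for showback or chargeback

print([r.path for r in search(records, tag="pii")])
```

The metadata lookups stay cheap because they never open a file. The content scan is the expensive part, since every file has to be read at least once, which is exactly the speed concern at petabyte scale.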
I wrote an article before #CFD8 about the Igneous pivot from a traditional storage company to a cloud company. What they showed at CFD8 is that it may be more accurate to view Igneous as a hybrid. I consider them a hybrid instead of pure cloud because the supported sources don’t include cloud storage like AWS S3 or Azure Blob. Those are destinations only, which means Igneous is a one-way trip. Cloud-to-cloud backup or transfer isn’t supported as an option with the current GUI configuration. Neither is repatriation, where data is moved from the cloud back to an on-premises location. Granted, these are not common use cases, but these scenarios do exist, and their absence means the product is an 80% solution. To their credit, the 80% they do is very well done based on the demo they showed. In the video above, Igneous showed a full backup of 500GB of data to S3 in about 2 minutes. That’s impressive and shouldn’t be taken lightly. The Igneous pivot from a competitive storage vendor to a super-fast backup player is a neat application of their technology.
References
https://developers.google.com/search/docs/guides/intro-structured-data
https://www.datamation.com/big-data/structured-vs-unstructured-data.html