Big data storage

RuBAN has been independently verified to scale to multi-petabyte database sizes.

All of the data streaming back to the head end from remote devices needs to be stored somewhere, so RuBAN has built-in big data storage. RuBAN uses separate layers to achieve this. The first layer is the collectors, which read the data coming in off the wire and route it to the database. The second layer is the data storage itself. RuBAN uses a Cassandra database with a time series database wrapped around it, which allows time series sensor data to be written and read in real time.

A query cache is layered on top of the storage and helps serve content to users intelligently. For example, if an external system requests the same data over and over, RuBAN can cache the previous results so that the full query does not need to be performed again and only the newest piece of data needs to be fetched.

These layers can all reside on the same server or, as the deployment grows, be spun out into their own separate VMs. This is called elastic scalability. RuBAN can call into the vSphere API to perform this scaling as needed. RuBAN's scale has been independently verified on the Vblock architecture. For example, on the Vblock 300 series platform RuBAN can support 104 million sensors sending data back to it. This equates to 1 petabyte of information over 1 year.
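The caching behaviour described above can be illustrated with a minimal Python sketch. This is not RuBAN's implementation. The names `IncrementalQueryCache`, `InMemoryStore` and `fetch_since` are hypothetical, and a plain in-memory dictionary stands in for the Cassandra-backed time series store. The sketch only shows the idea of keeping earlier results and fetching just the points newer than the last cached timestamp.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, float]  # (timestamp, value)


class InMemoryStore:
    """Hypothetical stand-in for the time series database."""

    def __init__(self) -> None:
        self.rows: Dict[str, List[Point]] = {}
        self.points_read = 0  # counts how many points the store had to return

    def write(self, sensor_id: str, ts: int, value: float) -> None:
        # Assumes points for a sensor are written in ascending timestamp order.
        self.rows.setdefault(sensor_id, []).append((ts, value))

    def fetch_since(self, sensor_id: str, after_ts: int) -> List[Point]:
        # Return only points strictly newer than after_ts.
        result = [p for p in self.rows.get(sensor_id, []) if p[0] > after_ts]
        self.points_read += len(result)
        return result


class IncrementalQueryCache:
    """Keeps earlier results per sensor and fetches only newer points."""

    def __init__(self, fetch_since: Callable[[str, int], List[Point]]) -> None:
        self._fetch_since = fetch_since
        self._cache: Dict[str, List[Point]] = {}

    def query(self, sensor_id: str) -> List[Point]:
        cached = self._cache.setdefault(sensor_id, [])
        # -1 means "nothing cached yet" (assumes non-negative timestamps).
        last_ts = cached[-1][0] if cached else -1
        cached.extend(self._fetch_since(sensor_id, last_ts))
        return list(cached)  # copy, so callers cannot mutate the cache


store = InMemoryStore()
store.write("sensor-1", 1, 20.5)
store.write("sensor-1", 2, 20.7)

cache = IncrementalQueryCache(store.fetch_since)
first = cache.query("sensor-1")   # full read: two points fetched from the store
store.write("sensor-1", 3, 21.0)
second = cache.query("sensor-1")  # only the one new point is fetched
```

In this sketch the second call returns all three points, but the store only has to supply the newest one. A real cache would also need eviction and a policy for late-arriving or corrected data, which are left out here.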
