CS3MESH4EOSC is kicking off its 2nd podcast episode, entitled “Data Science Environment, What is it and what are the Main Benefits?” This episode focuses on Data Science Environments, a data service that will be integrated into the Science Mesh, the main asset of the CS3MESH4EOSC project.
Data Science Environments
This specific Data Service is all about the integration of data science environments into the federated Science Mesh, in order to facilitate collaborative research and enable cross-federation sharing of computational tools, algorithms and resources.
But what is the main functionality? Data Science Environments are accessible through a web interface at researchers' remote sites, so that researchers can work interactively on algorithms and data processing programs.
The objective is for users to be able to access remote execution environments to replay (and modify) analysis algorithms without having to set up accounts in the remote system upfront.
The functional integration with EFSS (Enterprise File Sync and Share) services includes:
· Interactive features: advance from the current JupyterHub to JupyterLab with collaborative notebook editing, and explore interactive widgets such as those provided by QuantStack Voila, etc.
· Jupyter-native interfaces for OCM (Open Cloud Mesh) sharing
· Connection to code repositories (Git-based or CVMFS-based) and to lightweight runtime environments similar to mybinder.org
· Interface to computational resources (such as BigData Spark, HPC, batch and Grid clusters).