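A "data warehouse" is a kind of database system. Much of what we now call "big data" is queried and analyzed in exactly this kind of system. In this episode, we talk about what a data warehouse is, its history, its key technologies, and related systems.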
Hosts: 斯图亚特, Sean Wang, Cat Chen
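01:34 What is a data warehouse
12:26 Data warehouse technology
36:29 ETL: Extract, Transform, Load
43:06 Data warehouses and machine learning

Two kinds of database systems: operational systems and data warehouses
The history of data warehouses
The data warehouse trend led by internet companies

Data warehouse technology
Milestone paper: Mike Stonebraker: "One size fits all": an idea whose time has come and gone (2005)
Column storage, and how its technical characteristics differ from those of operational systems
MapReduce and the controversy around it
SQL in the Hadoop ecosystem, started by Hive
The major cloud data warehouse systems (Redshift, BigQuery, Azure, Snowflake)

ETL: Extract, Transform, Load
How to load data into a data warehouse
Data cleaning and data integration
HTAP (Hybrid transactional/analytical processing)

Data warehouses and machine learning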
Did Bill Inmon coin the term in the 1970s? https://en.wikipedia.org/wiki/Bill_Inmon

In 1988, IBM researchers Barry Devlin and Paul Murphy coined the term information warehouse, and IT shops began building experimental data warehouses. In 1991, W.H. "Bill" Inmon made data warehouses practical when he published a how-to guide, Building the Data Warehouse (John Wiley & Sons). https://web.archive.org/web/20080708182105/http://www.computerworld.com/databasetopics/data/story/0%2C10801%2C70102%2C00.html

Mike Stonebraker's milestone paper: Michael Stonebraker and Ugur Cetintemel. 2005. "One Size Fits All": An Idea Whose Time Has Come and Gone. In Proceedings of the 21st International Conference on Data Engineering (ICDE '05).

Two database heavyweights, David DeWitt and Mike Stonebraker, criticize MapReduce: "MapReduce: A major step backwards" https://homes.cs.washington.edu/~billhowe/mapreduce_a_major_step_backwards.html

Image by Pexels from Pixabay
Exzel Music Publishing (freemusicpublicdomain.com)
Licensed under Creative Commons: By Attribution 3.0
http://creativecommons.org/licenses/by/3.0/