Big data has been one of the most fashionable terms of the past decade and, driven by the rapid growth of the internet, has become an emerging and interesting research field. The data now accumulated across every industry and field can fairly be described as massive, and "big data" refers not only to a huge amount of data but also to data that grows very fast. Beyond the obvious challenge of storage, how to digest, analyze, and make good use of this data effectively is a major issue today. This is an introductory course on big data for freshmen. It first covers the history of big data, then the algorithms and the languages and tools used to process big data, such as R, Hadoop, and Spark. Finally, it discusses the potential of big data in fields such as life science, medical science, and marketing.
Prerequisites
1. Some background in any programming language is helpful but not required.
Teaching Methods
Lecturing
Student hands-on practice
Exercises and written reports
Assessment
Practical exam
Class participation and performance
References
1. 大數據浪潮, 李德偉 et al. (上奇時代)
2. 大數據挖掘, 譚磊 (上奇時代)
3. 網路+大數據, 陳建英 and 黃演紅 (碁峯)
4. Big Data, Data Mining, and Machine Learning, Jared Dean (Wiley)
Course Schedule
Instructor for all sessions: 董人銓 (Tung, Jen-Chuan)
2020/10/09
1. Introduction: the big data wave
2. The predecessor of big data: data mining
3. The technological revolution that ushered in the big data era
4. How big data is changing the world
2020/10/16
1. Understanding data mining through face recognition
2. The basic data mining workflow
2020/10/23
1. The appeal of big data
2. Data mining for big data
3. Big data technologies
2020/10/30
1. Data storage
2. Traditional data storage
3. Cloud storage
2020/11/06
1. How big data changes thinking and decision-making
2. Exploring data
3. The mathematical and philosophical foundations of big data
2020/11/13
1. Important algorithms in data mining
2. Classification algorithms
3. Clustering algorithms
4. Association algorithms (Apriori)
5. Sequence mining
2020/11/20
1. Feasibility and reliability, absoluteness and relativity
2. The limits of rationality and faith in mathematics
3. The information revolution
2020/11/27
1. History of the R language
2. Other data mining tools
2020/12/04
Midterm exam
2020/12/11
1. The thinking model behind cloud computing
2. Wikipedia
3. Artificial intelligence
4. Information asymmetry
2020/12/18
1. Introduction to website log files
2. Processing website log files
3. Mail log files
2020/12/25
1. Big data in clothing marketing
2. Big data in food and beverage marketing
3. Big data in street-view and map data
2021/01/01
R language hands-on practice
2021/01/08
1. Big data in medical services
2. Big data in manufacturing
2021/01/15
R language hands-on practice
2021/01/22
1. Big data in internet advertising
2021/01/29
R language hands-on practice
2021/02/05
Final exam