MDS Assignment1 2023
Please solve the following questions and justify your answers using Python. Show all of your
analysis results, including the Python code, in your report. Upload a “zip” file containing either
(1) an MS Word/LaTeX PDF report (answering each question and its sub-questions) and the Python
code, or (2) a notebook (including answers and code), with the file name
“MDS_Assignment1_ID_Name.zip”, to NTU COOL by the due date. Late submissions are not accepted.
The dataset consists of red vinho verde wine samples from the north of Portugal. The goal is to model wine quality
based on physicochemical tests. See Cortez et al. (2009) for more information.
Input variables (based on physicochemical tests):
1 - fixed acidity
2 - volatile acidity
3 - citric acid
4 - residual sugar
5 - chlorides
6 - free sulfur dioxide
National Taiwan University Manufacturing Data Science
Department of Information Management Instructor: Chia-Yen Lee, Ph.D.
• InvoiceNo: Invoice number. Nominal, a 6-digit integral number uniquely assigned to each
transaction. If this code starts with the letter 'c', it indicates a cancellation.
• StockCode: Product (item) code. Nominal, a 5-digit integral number uniquely assigned to
each distinct product.
• Description: Product (item) name. Nominal.
• Quantity: The quantity of each product (item) per transaction. Numeric.
• InvoiceDate: Invoice date and time. Numeric, the day and time when each transaction was
generated.
• UnitPrice: Unit price. Numeric, product price per unit in sterling.
• CustomerID: Customer number. Nominal, a 5-digit integral number uniquely assigned to
each customer.
• Country: Country name. Nominal, the name of the country where each customer resides.
This is exactly the kind of data used in “market basket analysis”: a collection of invoices in
which each line represents one item purchased under its corresponding invoice. A set of lines
sharing the same InvoiceNo is called a transaction. See the online retail data set
(MDS_Assignment1_OnlineRetail.xlsx). Note that you ONLY need the first and second columns,
which you should transform into the “format” required for extracting association rules (we do
not consider the purchase quantity here). Use “association rules” to find the potential patterns
that satisfy the following criterion:
Source:
https://archive.ics.uci.edu/dataset/352/online+retail
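The transformation described above (lines sharing an InvoiceNo become one transaction) and the rule search can be sketched in plain Python as follows. This is only a minimal illustration, not the required solution: the toy rows, the function names build_transactions and pair_rules, the thresholds 0.5 and 0.6, and the decision to drop cancelled invoices are all assumptions made for the example. It also assumes the first two columns are InvoiceNo and StockCode, and it only considers rules with a single item on each side.

```python
from collections import defaultdict
from itertools import combinations

# Toy rows standing in for the first two columns (InvoiceNo, StockCode).
# These values are invented for illustration, not taken from the data set.
rows = [
    ("536365", "A"), ("536365", "B"), ("536365", "C"),
    ("536366", "A"), ("536366", "B"),
    ("536367", "A"), ("536367", "C"),
    ("536368", "B"), ("536368", "C"),
    ("C536369", "A"),  # cancellation: InvoiceNo starts with 'c'/'C'
]

def build_transactions(rows, drop_cancellations=True):
    """Group line items by InvoiceNo into sets of StockCodes."""
    baskets = defaultdict(set)
    for invoice_no, stock_code in rows:
        # Assumption: cancelled invoices are excluded from the baskets.
        if drop_cancellations and str(invoice_no).upper().startswith("C"):
            continue
        baskets[str(invoice_no)].add(str(stock_code))
    return list(baskets.values())

def pair_rules(transactions, min_support, min_confidence):
    """Return rules X -> Y (one item each) that meet both thresholds."""
    n = len(transactions)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for basket in transactions:
        for item in basket:
            item_count[item] += 1
        for pair in combinations(sorted(basket), 2):
            pair_count[pair] += 1
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n                      # P(X and Y)
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]   # P(Y | X)
            lift = confidence / (item_count[y] / n)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence, lift))
    return sorted(rules)

transactions = build_transactions(rows)
for x, y, s, c, l in pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    print(f"{x} -> {y}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

For the full data set, the same grouping step would be applied to the rows read from MDS_Assignment1_OnlineRetail.xlsx, and a dedicated association-rule library may be preferable for rules with more than two items.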
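(c) (10%) Based on the production line in this problem, write a program to build a simulation
model (or use a software package or numerical analysis) to verify that when the amount of
work-in-process (WIP) exceeds the factory's capacity, the cycle time deteriorates severely. That
is, when the release rate (input volume) of the line is greater than the line's output rate, the
production system is in a non-steady state. Use charts to show how the relationship among WIP,
CT, and TH deteriorates. (Hint: refer to the lecture notes.)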
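One possible starting point for such a simulation is sketched below. It does not model the production line of the problem, which must be taken from the problem statement. Instead it assumes a hypothetical single-station FIFO line with constant release intervals and constant process times. The function name simulate_line, the capacity of 1.0 job per time unit, and the release rates swept are all illustrative assumptions.

```python
def simulate_line(release_rate, capacity, n_jobs):
    """Simulate a single-station FIFO line with constant release and process times.

    Returns (TH, CT, WIP) measured from time 0 until the last job departs.
    """
    release_gap = 1.0 / release_rate     # time between consecutive job releases
    process_time = 1.0 / capacity        # time the station needs per job
    station_free_at = 0.0
    total_cycle_time = 0.0
    last_departure = 0.0
    for job in range(n_jobs):
        release = job * release_gap
        start = max(release, station_free_at)   # wait if the station is busy
        departure = start + process_time
        station_free_at = departure
        total_cycle_time += departure - release
        last_departure = departure
    th = n_jobs / last_departure              # jobs completed per unit time
    ct = total_cycle_time / n_jobs            # average time a job spends in the line
    wip = total_cycle_time / last_departure   # time-average number of jobs in the line
    return th, ct, wip

if __name__ == "__main__":
    capacity = 1.0  # assumed station capacity (jobs per unit time)
    for release_rate in (0.5, 0.8, 1.0, 1.2, 1.5, 2.0):
        th, ct, wip = simulate_line(release_rate, capacity, n_jobs=1000)
        print(f"release={release_rate:.1f}  TH={th:.3f}  CT={ct:8.2f}  WIP={wip:8.2f}")
```

In this sketch, TH stops increasing once the release rate exceeds the capacity, while CT and WIP keep growing with the number of jobs released, which is the non-steady-state behaviour the question asks about. The three returned values satisfy Little's law (WIP = TH × CT) by construction and can be plotted against the release rate with any charting tool. Random process times could be added to study the effect of variability.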
Note
1. Show all your work in detail. Innovative ideas are encouraged.
2. If your answer refers to any external source, you must give an academic citation.
Plagiarism is not allowed.