First week of July#
June 26 ~ June 30#
June 29#
Reported to Academia Sinica.
June 30#
The Linux command line for beginners https://ubuntu.com/tutorials/command-line-for-beginners
Installed Ubuntu in a virtual machine.
July 03 ~ July 07#
July 03#
Attended the GIS Center intern welcome party.
Introduction to Ubuntu in Chinese (written for version 20.04, still applicable)
File permissions https://linuxize.com/post/understanding-linux-file-permissions/
Managing and Sharing your data
https://ukdataservice.ac.uk/media/622417/managingsharing.pdf
Chapter: SHARING YOUR DATA - WHY AND HOW
Chapter: DATA MANAGEMENT PLANNING
Chapter: DOCUMENTING YOUR DATA
July 04#
Attended "GIScience in A hybrid Physical-Virtual World".
Attended "Introduction to Research Data Management (RDM)".
nano (preferred) https://www.howtogeek.com/howto/42980/the-beginners-guide-to-nano-the-linux-command-line-text-editor/
https://www.dcc.ac.uk/guidance/how-guides/how-develop-rdm-services
Chapter 5. Components of an RDM service
2022-07-06 Introduction to Research Data Management (研究資料管理概論)
2023-07-04 Introduction to Research Data Management (研究資料管理概論)
July 05#
tmux (terminal multiplexer) https://linuxize.com/post/getting-started-with-tmux/
htop (interactive process viewer) https://hisham.hm/htop/
Installed htop in the virtual machine.
Introduction to the features of the research data repository (研究資料寄存所)
Sustainable Authorship in Plain Text using Pandoc and Markdown
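[✅ Assignment completed] Find 2-3 data management plans that you consider well written (mainly in English, any topic) and explain, in no more than 150 words each, why you think they are good. (This can be written in the personal work report.)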
July 06#
Read Pattern of user behaviour in HUMANITIES COMMONS. BASED ON COMMONS IN A BOX
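[✅ Assignment completed] LML exercise: MD to HTML & PDF (English version)
Show the document in a plain-text editor (source file)
Output result (HTML), compared side by side with the source file
Following the teaching material provided by the lab, I added the metadata below during the .docx to .md step, and only then ran the later conversions.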
title: Ontology-based patent landscape for advanced bike sharing services
author: Chieh Hsi Chen, Dewanti Anggrahini, Chin Yun Chang, Amy J.C. Trappey
date: January 20, 2023
fontfamily: times
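The font had to be installed separately.
However, the .md to .html conversion did not display this metadata; it appeared only in the .md to .pdf conversion.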
html
pdf
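A likely explanation for this difference: by default pandoc writes HTML as a fragment without a title block, whereas PDF output is always a standalone document. The following is a minimal sketch, assuming pandoc is installed; `demo.md` and its metadata values are hypothetical placeholders, not the assignment file.

```shell
# Work in a throwaway directory so nothing in the project is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical demo file with a YAML metadata block (placeholder values).
cat > demo.md <<'EOF'
---
title: Demo title
author: Demo author
date: January 20, 2023
---

Body text.
EOF

if command -v pandoc >/dev/null 2>&1; then
    # Default: an HTML fragment, so title, author and date are not written out.
    pandoc demo.md -o fragment.html
    # -s / --standalone: a complete HTML document that includes the title block.
    pandoc -s demo.md -o standalone.html
else
    echo "pandoc not found; skipping conversion" >&2
fi
```

Comparing `fragment.html` with `standalone.html` in a text editor should show the title block only in the second file.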
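Reflections on using LML (e.g. convenient or tedious? limitations during conversion? problems encountered?)
During conversion, an image path could not be a URL; it had to be the image's location on the local machine. When using HackMD, however, I usually paste the image URL directly for convenience, so the images failed to convert when I later ran pandoc in the virtual machine, and I had to rewrite the paths. –[name=偉豪]
Converting Chinese documents is more troublesome and needs extra options, such as xelatex. https://sam.webspace.tw/2020/01/13/使用 Pandoc 將 Markdown 轉為 PDF 文件/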
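The two points above can be sketched together. This is only an illustration under assumptions: `note.md`, the `example.com` URL, and the `images/fig1.png` location are hypothetical, and `Noto Sans CJK TC` stands in for whichever CJK font is actually installed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical note with an image referenced by URL, as pasted from HackMD.
printf '%s\n' '# 測試文件' '' '![figure](https://example.com/uploads/fig1.png)' > note.md

# Rewrite the image reference to point at a local copy instead.
# images/fig1.png is an assumed location; the image itself must be
# downloaded or copied there separately.
sed 's|(https://example.com/uploads/|(images/|' note.md > note_local.md
cat note_local.md

# Convert to PDF with XeLaTeX so that Chinese text can be typeset.
# The font name is an assumption; use any CJK font installed locally.
if command -v pandoc >/dev/null 2>&1 \
    && command -v xelatex >/dev/null 2>&1 \
    && [ -f images/fig1.png ]; then
    pandoc note_local.md -o note.pdf \
        --pdf-engine=xelatex -V CJKmainfont="Noto Sans CJK TC"
fi
```

The conversion step is skipped in this demo because no real image is present; with a local copy of the image in `images/`, the same command should produce `note.pdf`.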
Which command was used to produce the HTML output? Please explain it.
For output to a file, use the -o option:
pandoc -o output.html input.txt
[name=Pandoc User’s Guide]
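To unpack the quoted command: `-o` names the output file, and pandoc infers the output format from the `.html` extension; an input extension it does not recognise, such as `.txt`, is read as Markdown. A small sketch, assuming pandoc is installed (the file contents are made up for illustration):

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input file containing Markdown text.
printf '%s\n' '# Heading' '' 'Some *emphasised* text.' > input.txt

if command -v pandoc >/dev/null 2>&1; then
    # Output format is inferred from the .html extension of the -o argument.
    pandoc -o output.html input.txt
    # The same conversion with reader (-f) and writer (-t) spelled out.
    pandoc -f markdown -t html -o output_explicit.html input.txt
    # cmp is silent when the two files are byte-for-byte identical.
    cmp output.html output_explicit.html && echo "same result"
else
    echo "pandoc not found; skipping conversion" >&2
fi
```

Spelling out `-f` and `-t` is mainly useful when the file extensions do not match the formats actually wanted.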
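depositar user manual https://docs.depositar.io/
[✅ Assignment completed] On the depositar demo site, create a public test dataset, fill in its metadata, and upload data and resources in at least two different open formats.
Dataset name: 2023暑期實習測試資料集_YourLastName (2023 summer internship test dataset)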
Depositar Demo: 2023暑期實習測試資料集_Chen [name=Chen (NTHU107048212)]
Video: Primary data versus secondary data (summarized by ChatGPT)
Primary data refers to data that is collected directly by the researcher, while secondary data refers to data that is already available and collected by someone else.
Collecting primary data is often more restricted due to limited resources, whereas secondary data analysis can utilize larger resources.
The creation of primary data may involve activities like content analysis of newspapers, which can be relatively easy and straightforward.
Large datasets that later serve as secondary data are often collected with techniques like computer-assisted personal interviewing, where a handheld computer guides interviewers through a set of questions.
The resources involved in creating secondary data are typically more extensive than those available to an individual researcher in a university setting.
Quantitative research often uses secondary data because it is frequently of higher quality: it is typically based on well-drawn random samples and can employ techniques like stratified or cluster sampling.
The questions in these surveys usually undergo cognitive testing to ensure they measure the intended variables accurately.
Technological advances have made centrally created data more easily accessible through the internet, eliminating the need to request physical copies like computer tapes. Researchers can now download datasets online within minutes.
Video: Limitations of secondary data (summarized by ChatGPT)
Lack of control over the questions: One of the main limitations of secondary data is that the researcher has no control over the questions that were asked during data collection. The questions may not align perfectly with the researcher’s specific interests or research objectives.
Potential mismatch between variables: Researchers must be aware of the gap between what the variables in the secondary dataset measure and what they ideally wanted to measure. It is crucial to understand the limitations and discrepancies in the measurement of variables and not pretend they measure something they do not.
Time-consuming organization: Analyzing secondary data can be time-consuming due to the need to thoroughly understand the documentation that accompanies the dataset. Researchers must invest time in comprehending the sampling methods, variable definitions, treatment of missing values, handling of “don’t know” responses, and question routing to different types of respondents in the survey. These details are vital for correctly interpreting the results of the analysis.
In summary, the limitations of secondary data include lack of control over questions, potential mismatch between variables, and the time-consuming process of organizing and understanding the dataset documentation.
Lightweight markup language (LML)
Markdown https://markdown.tw/
reStructuredText (ReST) https://docutils.sourceforge.io/docs/user/rst/quickstart.html
Tool: pandoc
https://www.dcc.ac.uk/guidance/how-guides/how-develop-rdm-services
Chapter 5. Components of an RDM service
“Command Line Challenge”, Javad Solves (with help from ChatGPT and Google)
Read VNCserver
July 07#
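[✅ Assignment completed] (Extension assignment, optional) SSG exercise: use Pelican to publish the LML articles above as a blog site
Found that although the file (MDtoHTMLPDF .md) had already been copied into the project folder with cp, content still had nothing in it.
Moved the file (MDtoHTMLPDF .md) into the content folder with mv.
Successfully published the first article (MDtoHTMLPDF .md) to the blog.
Successfully published the second article (CHNtans .md) to the blog.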
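The cp versus mv issue comes down to where Pelican looks for articles: it reads them from the content folder, not from the project root. A minimal sketch, assuming Pelican is installed; the folder layout, file name, and metadata values here are hypothetical stand-ins for the real project.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p project/content project/output

# Hypothetical article; Pelican expects Title and Date metadata at the top.
cat > project/MDtoHTMLPDF.md <<'EOF'
Title: MD to HTML and PDF
Date: 2023-07-07

Article body.
EOF

# A file sitting in the project root is not picked up,
# so move it into content/ before building.
mv project/MDtoHTMLPDF.md project/content/
ls project/content

if command -v pelican >/dev/null 2>&1; then
    # Build the site from content/ into output/ (run in a subshell).
    (cd project && pelican content -o output)
else
    echo "pelican not found; skipping build" >&2
fi
```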
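The virtual machine reported warnings about the local image paths in both articles and the images were not output, so I tried replacing each local image path with a link to the image on the web (found via Google).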
copy image link
Text and images were both output successfully.
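An alternative to web links that I did not try here: Pelican can also serve local images if they are placed under `content/images` (part of its default static paths) and referenced with the `{static}` prefix. A sketch under those assumptions, with a hypothetical post and an empty placeholder standing in for a real image file:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p content/images

# Empty placeholder standing in for a real image (hypothetical file name).
: > content/images/fig1.png

# Hypothetical post that references the image through {static}.
cat > content/post.md <<'EOF'
Title: Post with a local image
Date: 2023-07-07

![figure]({static}/images/fig1.png)
EOF

if command -v pelican >/dev/null 2>&1; then
    # Articles go to output/, and content/images is copied to output/images.
    pelican content -o output
else
    echo "pelican not found; skipping build" >&2
fi
```

This keeps the blog independent of external image hosts, at the cost of storing the image files in the project.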