7月第一週

7月第一週#

June 26 ~ June 30#

June 29#

至中研院報到。

June 30#

The Linux command line for beginner https://ubuntu.com/tutorials/command-line-for-beginners
已安裝Ubuntu在虛擬機。

July 03 ~ July 07#

July 03#

參加GIS中心實習生歡迎會
Ubuntu 中文簡介 (20.04 版本，仍適用)
File permissions https://linuxize.com/post/understanding-linux-file-permissions/
Science Europe - 國際合用的研究資料管理實用指南—增訂版
Managing and Sharing your data
- https://ukdataservice.ac.uk/media/622417/managingsharing.pdf
  - Chapter: SHARING YOUR DATA - WHY AND HOW
  - Chapter: DATA MANAGEMENT PLANNING
  - Chapter: DOCUMENTING YOUR DATA

July 04#

參加GIScience in A hybrid Physical-Virtual World
參加Introduction to Research Data Management (RDM)
nano (preferred) https://www.howtogeek.com/howto/42980/the-beginners-guide-to-nano-the-linux-command-line-text-editor/
https://www.dcc.ac.uk/guidance/how-guides/how-develop-rdm-services
- Chapter 5. Components of an RDM service
2022-07-06 研究資料管理概論
2023-07-04 研究資料管理概論

July 05#

tmux (terminal multiplexer) https://linuxize.com/post/getting-started-with-tmux/
htop (interactive process viewer) https://hisham.hm/htop/
已安裝htop在虛擬機。
研究資料寄存所功能介紹
Sustainable Authorship in Plain Text using Pandoc and Markdown
【✅完成作業】找尋 2-3 篇你認為撰寫良好的資料管理方案（英文為主，不限主題），並以每篇 150 字內的篇幅，說明你覺得良好的理由。（可寫在個人工作報告內）
1. 從DMPonline觀“NTU Psychology - Culture Survey Service Evaluation”
2. 從DMPassistant觀“Open Data Practices in Engineering”

July 06#

Read Pattern of user behaviour in HUMANITIES COMMONS. BASED ON COMMONS IN A BOX
【✅完成作業】LML 練習：MD to html & PDF (English version)
- 在純文字編輯器中展示文件原始文件
- 輸出成果 (HTML)，並與原始文件比較 (並陳)
  - 搭配LAB提供的教材，在.docx到.md的過程中加入下方的資料說明，才進行後續轉檔。
```
title: Ontology-based patent landscape for advanced bike sharing services
author: Chieh Hsi Chen, Dewanti Anggrahini, Chin Yun Chang, Amy J.C. Trappey
date: January 20, 2023
fontfamily: times
```
  需要再安裝字體
  - 不過.md轉換到.html並沒有顯示這些「說明」。而在.md轉換到.pdf才有顯示這些說明。
    - html
    - pdf
- 使用 LML 的心得 (如：便利或繁瑣？轉換時的限制？遇到的問題？)
  - 轉檔時，「圖片」的路徑設定「不能直接使用網址」，而是需要「使用該圖片在本機的位置」。然而在使用HackMD的過程中為求方便，我通常直接貼圖片的網址，造成後續於虛擬機操作pandoc時的圖片無法轉換，需要再調整位址的描述方式。 –[name=偉豪]
  - 中文的轉檔更加麻煩，需要再輸入其他指令，如：xelatex。 https://sam.webspace.tw/2020/01/13/使用 Pandoc 將 Markdown 轉為 PDF 文件/
- 輸出 (HTML) 使用的指令為何？並請解釋它
  
  For output to a file, use the -o option: pandoc -o output.html input.txt [name=Pandoc User’s Guide]
Depositar操作手冊 https://docs.depositar.io/
【✅完成作業】於 depositar demo 機，建立一個可公開的測試資料集，並填妥後設資料，及上傳至少二種不同 OpenFormat 格式的資料與資源。

資料集名稱：2023暑期實習測試資料集_YourLastName

Depositar Demo: 2023暑期實習測試資料集_Chen [name=Chen (NTHU107048212)]
https://mantra.edina.ac.uk/
- Video: Primary data versus secondary data Summarized by Chatpgt
  - Primary data refers to data that is collected directly by the researcher, while secondary data refers to data that is already available and collected by someone else.
  - Collecting primary data is often more restricted due to limited resources, whereas secondary data analysis can utilize larger resources.
  - The creation of primary data may involve activities like content analysis of newspapers, which can be relatively easy and straightforward.
  - Secondary data analysis can utilize techniques like computer-assisted personal interviewing, where a handheld computer guides interviewers through a set of questions.
  - The resources involved in creating secondary data are typically more extensive than those available to an individual researcher in a university setting.
  - Quantitative research often uses secondary data due to its higher quality, as it is based on well-drawn random samples and can employ techniques like stratified or cluster sampling.
  - Cognitive testing is conducted on the questions used in secondary data analysis to ensure they measure the intended variables accurately.
  - Technological advances have made centrally created data more easily accessible through the internet, eliminating the need to request physical copies like computer tapes. Researchers can now download datasets online within minutes.
- Video: Limitations of secondary data Summarized by Chatpgt
  - Lack of control over the questions: One of the main limitations of secondary data is that the researcher has no control over the questions that were asked during data collection. The questions may not align perfectly with the researcher’s specific interests or research objectives.
  - Potential mismatch between variables: Researchers must be aware of the gap between what the variables in the secondary dataset measure and what they ideally wanted to measure. It is crucial to understand the limitations and discrepancies in the measurement of variables and not pretend they measure something they do not.
  - Time-consuming organization: Analyzing secondary data can be time-consuming due to the need to thoroughly understand the documentation that accompanies the dataset. Researchers must invest time in comprehending the sampling methods, variable definitions, treatment of missing values, handling of “don’t know” responses, and question routing to different types of respondents in the survey. These details are vital for correctly interpreting the results of the analysis.
  - In summary, the limitations of secondary data include lack of control over questions, potential mismatch between variables, and the time-consuming process of organizing and understanding the dataset documentation.
ISO 19115-1:2014 Geographic information — Metadata
ISO 19115 Topic Category
Markdown文件
Quick reStructuredText
Lightweight markup language (LML)
- Markdown https://markdown.tw/
- reStructuredText (ReST) https://docutils.sourceforge.io/docs/user/rst/quickstart.html
CSV https://www.computerhope.com/issues/ch001356.htm
Tool: pandoc
- https://pandoc.org/installing.html#linux
- https://programminghistorian.org/en/lessons/sustainable-authorship-in-plain-text-using-pandoc-and-markdown
- https://www.dcc.ac.uk/guidance/how-guides/how-develop-rdm-services
  - Chapter 5. Components of an RDM service
- 【✅完成作業】Linux command line 練習: 完成「第22關」
- “Command Line Challenge” — Javad Solves help by Chatgpt and google
- Read VNCserver

July 07#

【✅完成作業】(延伸作業，選擇性) SSG 練習：使用 Pelican，將上述 LML 文章發布成 Blog 網站
- 發現雖然已經cp文件(MDtoHTMLPDF .md)到project資料夾，但是content仍然沒有任何內容。
- 再將文件(MDtoHTMLPDF .md)mv到content資料夾
- 順利輸出第1篇文章(MDtoHTMLPDF .md)到blog
- 順利輸出第2篇文章(CHNtans .md)到blog
- 發現虛擬機針對兩篇文章的image的本機位置出現warning，而無法順利輸出圖片，嘗試調整成直接將圖片的本機位置，調整成Google網頁上的link 。
  
  copy image link
- 順利輸出文字及圖片

7月第一週

Contents

7月第一週#

June 26 ~ June 30#

June 29#

June 30#

July 03 ~ July 07#

July 03#

July 04#

July 05#

July 06#

July 07#

Addtional Helpers from Google#

Ubuntu#

HackMD+Markdown語法#

Linux指令#