[ML] General guide on ML

Loss on training data large: model bias -> add features optimization -> change optimization methods small: loss on testing data large: overfitting: (1) more training data, data augmentation (2) make model simpler small: mismatch

<span title='2024-01-14 14:31:56 +0800 +0800'>January 14, 2024</span>&nbsp;·&nbsp;1 min&nbsp;·&nbsp;Rain Hu

[Leetcode] Arrangement

題號 題目 類別 題解 1 Two Sum Hash Table Hash Table 2 Add Two numbers Linked List Recursion 201 Bitwise AND of Numbers Range Bitwise Operation Lowbit = x&(~x+1) 204 Count Primes Math Thetory The Sieve of Eratosthenes 408 Valid Word Abbreviation Two Pointers 字串轉數字

<span title='2023-10-06 16:30:47 +0800 +0800'>October 6, 2023</span>&nbsp;·&nbsp;1 min&nbsp;·&nbsp;Rain Hu

[LeetCode] 132. Palindrome Partitioning II

132. Palindrome Partitioning II Hardness: \(\color{red}\textsf{Hard}\) Ralated Topics: string, dynamic programming 一、題目 Given a string s, partition s such that every substring of the partition is a palindrome. Return the minimum cuts needed for a palindrome partitioning of s. Example 1: Input: s = “aab” Output: 1 Explanation: The palindrome partitioning [“aa”, “b”] could be produced using 1 cut. Example 2: Input: s = “a” Output: 0 Example 3: Input: s = “ab” Output: 1 Constraints: ...

<span title='2023-03-18 16:10:36 +0800 +0800'>March 18, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Rain Hu
Oh! You closed up the window, so you cannot see raining

[Algo] 2-3. 分治法 Divide and Conquer

一、分治法 分治法,簡而言之就是分而治之,把複雜的問題分成兩個或更多個相似或相似的子問題,直到子問題可以直接求解,最後再將子問題的解做合併。 三步驟:Divide、Conquer、Merge 以 pseudo code 來表示大概像: void func(collection set) { // 子問題求解 if (base_case) { // 根據要求將子問題解合併成母問題解 do_something return; } // 將母問題分解成子問題 for (collection subset : set) { func(subset); } } graph LR; 母問題-->子問題1; 母問題-->子問題2; subgraph Divide 子問題1-->最小子問題1; 子問題1-->最小子問題2; 子問題2-->最小子問題3; 子問題2-->最小子問題4; end subgraph Conquer 最小子問題1-->最小子問題解1; 最小子問題2-->最小子問題解2; 最小子問題3-->最小子問題解3; 最小子問題4-->最小子問題解4; end subgraph Merge 最小子問題解1-->合併; 最小子問題解2-->合併; 最小子問題解3-->合併; 最小子問題解4-->合併; end 合併-->母問題解 舉例說明,河內塔遊戲: 河內塔是由三根桿子以大小不同碟片所組成的遊戲,僧人一次可以從一根桿子移動一個碟片到另一根桿子,但是小的碟片若放憂大的碟片下面會使得小的碟片裂開(也就是碟片只能由上而下從小排到大),試問將一座塔從一根桿子完整的移動到另一根桿子需要移動多少次。 思考上面的情形,以三個碟片為例,若我們要從 A 到 C 移動一座塔,我們可以將之分解成「如何把上面兩個碟片移動到 B」,因為剩下的一個大碟片,可以很簡單的從 A 移動到 C。也就是說 f3(A->C) = f2(A->B) + f1(A->C) + f2(B->C)。 再更進一步,f2(A->B) 和 f2(B->C) 其實就是移動兩個碟片到另一座塔,所以可以分解成 f2(A->C) = f1(A->B) + f1(A->C) + f1(B->C),至此,我們已經把 f3 都分解成可以代表一次移動的最小子問題的解 f1 了: graph TD; A[f3,A->C] B[f2,A->B] C[f1,A->C] D[f2,B->C] E[f1,A->C] F[f1,A->B] G[f1,C->B] H[f1,B->A] I[f1,B->C] J[f1,A->C] A-->B A--->C A-->D B-->E B-->F B-->G D-->H D-->I D-->J 故我們可以以數學方式證明: \(\begin{array}{l} T(n)=T(n-1)+T(1)+T(n-1)=2T(n-1)+T(1)\\ T(n-1)=T(n-2)+T(1)+T(n-2)=2T(n-2)+T(1)\\ T(n)=2[2T(n-2)+T(1)]+T(1)\\ T(n)=2\times2T(n-2)+(1+2)T(1)\\ T(n)=2^k\times T(n-k)+(1+2+…+2^k)T(1)\\ 令k=n-1\\ T(n)=2^{n-1}\times T(1)+(1+2+…+2^{n-1})T(1)\\ T(n)=2^{n-1}\times T(1)+\frac{2^{n-1}-1}{2-1}T(1)\\ T(n)=(2^n-1)\times T(1) \end{array}\) 得所需要的移動次數為 \(2^n-1\) 次 分治法的特色 要解決的問題有一定的規模 該問題可以分解成若干個規模較小的問題 可以找到一個 base case,可以直接求解(如上述數學證明的\(T(1)\)) 分解出來的子問題都是相互獨立的。(若有相依性,則無法使用分治法) 分治法的時間複雜度 將規模為 n 的問題分為 k個規模為 n/m 的子問題去解,那麼可以得到 \(T(n)=kT(n/m)+f(n)\) 二、分治法的應用 1. 二元搜索法 Binary Search 令有一已排序的數列,欲查找該數列中是否有數值 x。 由於該數列已經過排序,所以我們無需遍歷整個數列,我們可以選擇每次挑選數列的中間值,若目標比中間值大,則選擇大的那側再繼續做篩選,此法稱為二元搜索法,其時間複雜度可以從線性搜索法的 \(O(n)\) 降低到 \(O(n\log n)\)。 int binarySearch(vector<int>& nums, int target) { int left = 0, right = nums.size()-1; while (left <= right) { int mid = left + (right-left)/2; if (nums[mid] == target) return mid; else if (nums[mid] < target) left = mid + 1; else if (nums[mid] > target) right = mid - 1; } return -1; } 2. Strassen 矩陣乘法 試做一個矩陣\(A\)與矩陣\(B\)內積。 \( A=\left[\begin{matrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{matrix}\right], B=\left[\begin{matrix}B_{11}&B_{12}\\B_{21}&B_{22}\end{matrix}\right], C=\left[\begin{matrix}C_{11}&C_{12}\\C_{21}&C_{22}\end{matrix}\right],其中\\ \left[\begin{matrix}C_{11}&C_{12}\\C_{21}&C_{22}\end{matrix}\right]=\left[\begin{matrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{matrix}\right]\cdot\left[\begin{matrix}B_{11}&B_{12}\\B_{21}&B_{22}\end{matrix}\right] \) 若通過一般展開可得 \( C_{11}=A_{11}\cdot B_{11}+A_{12}\cdot B_{21}\\ C_{12}=A_{11}\cdot B_{12}+A_{12}\cdot B_{22}\\ C_{21}=A_{21}\cdot B_{11}+A_{22}\cdot B_{21}\\ C_{22}=A_{21}\cdot B_{12}+A_{22}\cdot B_{22} \) 從上可得計算兩個 \(n\cdot n\) 的矩陣內積需要 兩個 \(\frac{n}{2}\cdot\frac{n}{2}\) 的矩陣做 8 次的乘法加上 4 次的加法,其時間複雜度可以表示成: \( T(n)=8T(\frac{n}{2})+\Theta(n^2)\\ T(\frac{n}{2})=8T(\frac{n}{4})+\Theta({\frac{n}{2}}^2)\\ T(n)=8\left[{8T(\frac{n}{4})+\Theta({{(\frac{n}{2}})}^2)}\right]+\Theta(n^2)\\ T(n)=8^2T(\frac{n}{4})+\Theta(n^2)+8\Theta(\frac{n^2}{4})=8^2T(\frac{n}{4})+(1+2)\Theta(n^2)\\ T(n)=8^kT(\frac{n}{2^k})+(1+2+…+2^{k-1})\Theta(n^2)\\ 令n=2^k\\ T(n)=n^3T(1)+(\frac{n/2-1}{2-1})\Theta(n^2)\approx \Theta(n^3) \) 若使用 Strassen 演算法 同樣將矩陣\(A,B,C\)作分解,\(時間\Theta(1)\) 創建 10 個 \(\frac{n}{2}\cdot\frac{n}{2}\) 的矩陣 \(S_1,S_2,…,S_{10}\),時間\(\Theta(n^2)\) \( S_1=B_{12}-B_{22}\\ S_2=A_{11}+A_{12}\\ S_3=A_{21}+A_{22}\\ S_4=B_{21}-B_{11}\\ S_5=A_{11}+A_{22}\\ S_6=B_{11}+B_{22}\\ S_7=A_{12}-A_{22}\\ S_8=B_{21}+B_{22}\\ S_9=A_{11}-A_{21}\\ S_{10}=B_{11}+B_{12}\\ \) 遞迴的計算 7 個矩陣積 \(P_1,P_2,…,P_7\),其中每個矩陣\(P_i\)都是\(\frac{n}{2}\cdot\frac{n}{2}\)的。 \( P_1=A_{11}\cdot S_1=A_{11}\cdot B_{12}-A_{11}\cdot B_{22}\\ P_2=S_2\cdot B_{22}=A_{11}\cdot B_{22}+A_{12}\cdot B_{22}\\ P_3=S_3\cdot B_{11}=A_{21}\cdot B_{11}+A_{22}\cdot B_{11}\\ P_4=A_{22}\cdot S_4=A_{22}\cdot B_{21}-A_{22}\cdot B_{11}\\ P_5=S_5\cdot S_6=A_{11}\cdot B_{11}+A_{11}\cdot B_{22}+A_{22}\cdot B_{11}+A_{22}\cdot B_{22}\\ P_6=S_7\cdot S_8=A_{12}\cdot B_{21}+A_{12}\cdot B_{22}-A_{22}\cdot B_{21}-A_{22}\cdot B_{22}\\\\ P_7=S_9\cdot S_{10}=A_{11}\cdot B_{11}+A_{11}\cdot B_{12}-A_{21}\cdot B_{11}-A_{21}\cdot B_{12}\\\\\\ \) 藉由 \(P_i\) 來計算得到 矩陣 \(C\):時間\(\Theta(n^2)\) \( C_{11}=P_5+P_4-P_2+P_6\\ C_{12}=P_1+P_2\\ C_{21}=P_3+P_4\\ C_{22}=P_5+P_1-P_3-P_7 \) 綜合已上可得: \( T(n)=\bigg\lbrace \begin{array}{ll} \Theta(1)&若n=1\\ 7T{\frac{n}{2}}+\Theta(n^2)&若n>1 \end{array} \) 故時間複雜度可推得 \(T(n)=\Theta(n^{\log_27})\approx \Theta(n^{2.807})\) 參考來源 Wikipedia 3. 合併排序 Merge Sort 在[Algo] 0-4. 二元樹(Binary Tree)中有介紹過,合併排序跟快速排序都有著類似前序、後序的結構, 步驟: 將數列拆成若干個只有 1 個元素的子數列(因為只有一個元素,所有可以視為已排序的數列)。 將已排序的數列兩兩合併,直到所有的數列都合併完成,即完成排序。 程式碼實作:mergeSort 4. 快速排序 Quick Sort 步驟: 選定一個數當作樞紐(pivot),將小於此數的值都放到左邊,大於此數的都放到右邊。 反覆同樣動作,直到子數列只有一個數,即完成排序。 程式碼實作:quickSort 三、例題 1. 樹類問題 樹相關的問題很常有著類似的解題結構: ...

<span title='2023-01-27 10:48:42 +0800 +0800'>January 27, 2023</span>&nbsp;·&nbsp;6 min&nbsp;·&nbsp;Rain Hu

[LeetCode] 491. Non-decreasing Subsequences

491. Non-decreasing Subsequences Hardness: \(\color{orange}\textsf{Medium}\) Ralated Topics: Array、Hash Table、Backtracking、Bit Manipulation 一、題目 Given an integer array nums, return all the different possible non-decreasing subsequences of the given array with at least two elements. You may return the answer in any order. Example 1: Input: nums = [4,6,7,7] Output: [[4,6],[4,6,7],[4,6,7,7],[4,7],[4,7,7],[6,7],[6,7,7],[7,7]] Explanation: Example 2: Input: nums = [4,4,3,2,1] Output: [[4,4]] Constraints: <= nums.length <= 15 -100 <= nums[i] <= 100 二、分析 這一很典型的是一個 backtrack 的問題,只要熟悉回溯法的框架並注意終止條件與處理重覆子序列即可。 終止條件為 i == nums.size(),此時將遍歷過的 path 加入 res,但要注意題目有規定至少要 2 個元素的子序列,故需要再加入前做確認。 注意題目傳回的子序列不可重覆,故需要額外做處理。 三、解題 1. Backtracking Time complexity: \(O(2^n)\) Space complexity: \(O(n)\) vector<vector<int>> findSubsequences(vector<int>& nums) { vector<vector<int>> res; vector<int> path; dfs(nums, 0, path, res); sort(res.begin(), res.end()); // 先做排序後 for (int i = res.size()-1; i >= 1; i--) { // 從後面往前迭代 if (res[i] == res[i-1]) { res.erase(res.begin()+i); // 刪除重覆的序列 } } return res; } void dfs(vector<int>& nums, int i, vector<int>& path, vector<vector<int>>& res) { if (i == nums.size()) { // 終止條件 if (path.size() > 1) { // 滿足子序列元素大於等於2個,則加入答案 res.push_back(path); } return; } if (path.size() == 0 || nums[i] >= path.back()) { // 注意需滿足題意為上升序列 path.push_back(nums[i]); // 加入子序列 dfs(nums, i+1, path, res); path.pop_back(); // 回溯法需將元素 pop 掉 } dfs(nums, i+1, path, res); // 跳過不取 } 2. Backtracking(optimized) Time complexity: \(O(2^n)\) Space complexity: \(O(n)\) vector<vector<int>> findSubsequences(vector<int>& nums) { vector<vector<int>> res; vector<int> path; dfs(nums, 0, path, res); return res; } void dfs(vector<int>& nums, int i, vector<int>& path, vector<vector<int>>& res) { if (i == nums.size()) { if (path.size() > 1) { res.push_back(path); } return; } if (path.size() == 0 || nums[i] >= path.back()) { path.push_back(nums[i]); dfs(nums, i+1, path, res); path.pop_back(); } if (path.size() == 0 || nums[i] != path.back()) { // 處理重覆子序列 dfs(nums, i+1, path, res); } } 回目錄 Catalog ...

<span title='2023-01-20 21:39:50 +0800 +0800'>January 20, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Rain Hu

[LeetCode] 974. Subarray Sums Divisible by K

974. Subarray Sums Divisible by K Hardness: \(\color{orange}\textsf{Medium}\) Ralated Topics: Array、Hash Table、Prefix Sum 一、題目 Given an integer array nums and an integer k, return the number of non-empty subarrays that have a sum divisible by k. A subarray is a contiguous part of an array. Example 1: Input: nums = [4,5,0,-2,-3,1], k = 5 Output: 7 Explanation: There are 7 subarrays with a sum divisible by k = 5: [4, 5, 0, -2, -3, 1], [5], [5, 0], [5, 0, -2, -3], [0], [0, -2, -3], [-2, -3] Example 2: ...

<span title='2023-01-19 13:50:19 +0800 +0800'>January 19, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Rain Hu

[LeetCode] 53. Maximum Subarray

53. Maxmimum Subarray Hardness: \(\color{orange}\textsf{Medium}\) Ralated Topics: Array、Divide and Conquer、Dynamic Programming 一、題目 Given an integer array num, find the subarray with the largest sum, and return its sum. Example 1: Input: nums = [-2,1,-3,4,-1,2,1,-5,4] Output: 6 Explanation: The subarray [4,-1,2,1] has the largest sum 6. Example 2: Input: nums = [1] Output: 1 Explanation: The subarray [1] has the largest sum 1. Example 3: Input: [5,4,-1,7,8] Output: 23 Explanation: The subarray [5,4,-1,7,8] has the largest sum 23. Constraints: ...

<span title='2023-01-18 23:20:08 +0800 +0800'>January 18, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Rain Hu

[LeetCode] 918. Maximum Sum Circular Subarray

918. Maximum Sum Circular Subarray Hardness: \(\color{orange}\textsf{Medium}\) Ralated Topics: Array、Divide and Conquer、Dynamic Programming、Queue、Monotonic Queue 一、題目 Given a circular integer array nums of length n, return the maximum possible sum of a non-empty subarray of nums. A circular array means the end of the array connects to the beginning of the array. Formally, the next element of nums[i] is nums[(i + 1) % n] and the previous element of nums[i] is nums[(i - 1 + n) % n]. A subarray may only include each element of the fixed buffer nums at most once. Formally, for a subarray nums[i], nums[i + 1], ..., nums[j], there does not exist i <= k1, k2 <= j with k1 % n == k2 % n. ...

<span title='2023-01-18 23:07:56 +0800 +0800'>January 18, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Rain Hu

[LeetCode] 926. Flip String to Monotone Increasing

926. Flip String to Monotone Increasing Hardness: \(\color{orange}\textsf{Medium}\) Ralated Topics: String、Dynamic Programming 一、題目 A binary string is monotone increasing if it consists of some number of 0’s (possibly none), followed by some number of 1’s (also possibly none). You are given a binary string s. You can flip s[i] changing 0 to 1 or from 1 to 0. Return the minimum number of flips to make s monotone increasing. Example 1: ...

<span title='2023-01-17 16:36:26 +0800 +0800'>January 17, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Rain Hu

[LeetCode] 1519. Number of Nodes in the Sub-Tree With the Same Level

1519. Number of Nodes in the Sub-Tree With the Same Level Hardness: \(\color{orange}\textsf{Medium}\) Ralated Topics: Hash Table、Tree、Depth-First Search、Breadth-First Search、Counting 一、題目 You are given a tree (i.e. a connected, undirected graph that has no cycles) consisting of n nodes numbered from 0 to n - 1 and exactly n - 1 edges. The root of the tree is the node 0, and each node of the tree has a label which is a lower-case character given in the string labels (i.e. The node with the number i has the label labels[i]). The edges array is given on the form edges[i] = [ai, bi], which means there is an edge between nodes ai and bi in the tree. Return an array of size n where ans[i] is the number of nodes in the subtree of the ith node which have the same label as node i. A subtree of a tree T is the tree consisting of a node in T and all of its descendant nodes. ...

<span title='2023-01-12 22:56:18 +0800 +0800'>January 12, 2023</span>&nbsp;·&nbsp;3 min&nbsp;·&nbsp;Rain Hu