
Adjust the Weekly Plan with LLMs

Problems with the previous plan

The plan looked good, but while trying to execute it I ran into several practical problems.

  • Lack of Math Knowledge (Linear Algebra, Calculus)
    It was hard to understand the formulas and explanations.
  • Different Suggestions
    I tried several LLMs (ChatGPT, Claude, Grok, DeepSeek), and of course each of them gave me a different answer.
    As an undergraduate student, it was hard to choose among the plans they gave me and to modify them.
  • Different Focus
    The LLMs gave very different answers when I changed the wording (e.g., "efficient and fast" -> "efficient").
    Even the focus kept shifting (too shallow, too many resources, out of scope).
    So I had to pin down my requirements and conditions first.

New method

  1. Ask again with a format, a role, a focus, requirements, and my situation
    So I rewrote the prompt, referencing parts of their earlier answers:
    
     As a professional AI researcher. imagine you're making it for a student do entire questions
     [former CSV table format including data]
     keeping
     Advanced Focus: Resources go beyond undergraduate basics (e.g., introductory Python or linear algebra) and dive into theoretical and practical depths of ML/DL.
     Research-Oriented: Includes seminal papers and recent works to align with your goal of following cutting-edge research.
     Master’s Preparation: Structured to help you explore specializations (e.g., NLP, computer vision) and build a strong application profile.
     Evidence-Based: Recommendations are grounded in widely recognized materials used in academia and industry.
     [my-plan-period]
     [my-time-table]
    

    And I used the reasoning mode (deep think / reason / whatever each model calls it).

  2. Let them judge each other's responses
    I also asked each model to output its plan as a CSV table.
    Then I copied all of the tables and asked again, with the reasoning mode on (a sketch of both prompts follows this list):
    ```
    1. [CSV1]
    2. [CSV2]
    3. [CSV3]
    Which is the best of all? And which is the best for study only?
    ```
    I asked about "study only" because their answers also included community/forum activities and master's preparation.
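
The assembly itself is simple enough to script. Below is a minimal Python sketch of how the two prompts above could be put together; the role line, the requirement bullets, and placeholders like [CSV1], the period, and the timetable are stand-ins based on my own prompt text, not any model's API.

```python
# Minimal sketch: assemble the structured plan prompt and the cross-model
# "judge" prompt. All placeholder strings are hypothetical stand-ins.
ROLE = "As a professional AI researcher, imagine you're making it for a student"
REQUIREMENTS = [
    "Advanced Focus: resources go beyond undergraduate basics",
    "Research-Oriented: include seminal and recent papers",
    "Master's Preparation: help explore specializations and build a profile",
    "Evidence-Based: ground recommendations in widely recognized materials",
]

def build_plan_prompt(csv_table: str, period: str, timetable: str) -> str:
    """Combine role, current plan (CSV), requirements, and personal constraints."""
    return "\n".join([ROLE, csv_table, "keeping", *REQUIREMENTS, period, timetable])

def build_judge_prompt(csv_answers: list[str]) -> str:
    """Ask a model to rank the plans produced by all the models."""
    numbered = "\n".join(f"{i}. {csv}" for i, csv in enumerate(csv_answers, start=1))
    return numbered + "\nWhich is the best of all? And which is the best for study only?"
```

Each prompt string is then pasted into the chat UI with the reasoning mode enabled; nothing here calls a model directly.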

Got a Satisfactory Result

For my situation, the ranking was DeepSeek > ChatGPT > Grok > Claude.
I was also impressed by DeepSeek's analytical answers; the quality was quite high.
So, after some formatting and adjustment, it is done!
I don't think I will change the plan even if I fall behind; I will just re-schedule to finish the remaining items.

Math Plan Table

| Focus Area | Topics | Resources | Tasks | Deliverables | Research Links |
| --- | --- | --- | --- | --- | --- |
| Vector Spaces | Vector spaces, subspaces, linear combinations | Friedberg Ch. 1.1-1.3; 3Blue1Brown Video 1-3 | Watch 3Blue1Brown videos (1 hr) | Notes on vector axioms | - |
| Vector Spaces | Vector spaces, subspaces, linear combinations | Friedberg Ch. 1.1-1.3; 3Blue1Brown Video 1-3 | Read Friedberg Ch. 1.1-1.3 (2 hrs); Solve Friedberg Ex. 1.4 (2 hrs); Python code for linear combinations (2 hrs) | Python code for linear combinations | - |
| Linear Transformations | Linear maps, matrix representations | Friedberg Ch. 2.1-2.3; MML Ch. 2.7 | Read Friedberg Ch. 2.1-2.3 (3 hrs); Implement matrix multiplication in NumPy (2 hrs); Proof of linearity for a map (2 hrs) | Code for matrix transformations; Proof of linearity for a map | - |
| Bases & Dimension | Linear independence, bases, dimension | Friedberg Ch. 1.5-1.6; MML Ch. 2.5-2.6 | Read Friedberg Ch. 1.5-1.6 (3 hrs); Solve MML Ex. 2.5 (1 hr); Basis for a dataset (e.g., MNIST pixels) (2 hrs) | Basis for a dataset; Dimension analysis notes | - |
| Eigenvalues | Eigenvalues, eigenvectors, diagonalization | Friedberg Ch. 5.1; MML Ch. 4.2 | Read Friedberg Ch. 5.1 (2 hrs); Compute eigenvalues for 3x3 matrices (2 hrs); Notes on PCA motivation (1 hr) | Eigenvalue Python script; Notes on PCA motivation | Turk & Pentland (1991) - Eigenfaces |
| SVD & Inner Products | Inner product spaces, Gram-Schmidt, SVD | Friedberg Ch. 6.1-6.2, 6.7; MML Ch. 4.5 | Implement SVD in NumPy (3 hrs); Read MML Ch. 4.5 (2 hrs); Truncated SVD for image compression (1 hr) | Truncated SVD for image compression; Orthogonal basis code | Koren et al. (2009) - SVD in Recommender Systems |
| Probability Basics | Probability spaces, distributions | MML Ch. 6.1-6.2; Deep Learning Ch. 3.9 | Read MML Ch. 6.1-6.2 (2 hrs) | Gaussian sampling Python script; Probability axioms summary | - |
| Review & PCA | Review concepts, PCA introduction | MML Ch. 10.2; Friedberg Ch. 6.7 | Summarize week (1 hr) | 1-page PCA summary | - |
| Advanced SVD | SVD theory, low-rank approximations | Golub & Van Loan Ch. 2.4; MML Ch. 4.6 | Read Golub Ch. 2.4 (3 hrs); Optimize SVD code for speed (2 hrs); Benchmarked SVD code (1 hr) | Benchmarked SVD code; Notes on numerical stability | - |
| Gradients & Backprop | Vector calculus, chain rule | MML Ch. 5.2-5.6; CS231n Notes | Derive gradients for a neural network (4 hrs); Verify with PyTorch (1 hr); Manual gradient calculations (2 hrs) | Manual gradient calculations; Autograd verification script | - |
| Optimization | Gradient descent, Lagrange multipliers | Boyd & Vandenberghe Ch. 9; MML Ch. 7.1-7.3 | Code gradient descent for linear regression (3 hrs); Read Boyd Ch. 9 (2 hrs); Convergence plot (1 hr) | Convergence plot; Lagrange multiplier examples | Kingma & Ba (2017) - Adam Optimizer |
| Convexity | Convex sets, functions, optimization | Boyd & Vandenberghe Ch. 2-3 | Prove convexity for SVM loss (2 hrs); Read Boyd Ch. 2 (3 hrs) | Convexity proofs; Notes on SVM objective | - |
| Probability Deep Dive | Exponential family, MLE | Deep Learning Ch. 3.9-3.12; MML Ch. 6.6 | Implement MLE for Gaussian (2 hrs); Read DL Ch. 3.9 (3 hrs); Exponential family summary (1 hr) | MLE Python script; Exponential family summary | - |
| NLP Math I | Word embeddings, attention | Attention Is All You Need (Vaswani et al., 2017) | Code self-attention with NumPy (2 hrs) | Transformer attention layer | Vaswani et al. (2017) - Transformers |
| NLP Math II | Geometric intuition, manifolds | Word2Vec (Mikolov et al., 2013); MML Ch. 3 | Visualize word vectors with t-SNE (1 hr) | t-SNE visualization code | Mikolov et al. (2013) - Word2Vec |
| CV Math I | Convolutions, CNNs | ImageNet Classification (Krizhevsky et al., 2012) | Implement Conv2D with NumPy (4 hrs); Read Sec. 2 (1 hr); Receptive field analysis (1 hr) | Conv layer code; Receptive field analysis | Krizhevsky et al. (2012) - AlexNet |
| CV Math II | BatchNorm, gradient flow | Batch Normalization (Ioffe & Szegedy, 2015) | Derive BatchNorm gradients (3 hrs); Read Sec. 4.2 (2 hrs); Gradient flow notes (2 hrs) | BatchNorm code; Gradient flow notes | Ioffe & Szegedy (2015) - BatchNorm |
| Research Paper I | Reproduce a result | Choose paper (e.g., PCA, Adam) | Replicate PCA on CIFAR-10 (6 hrs) | Jupyter notebook; Comparative analysis | User-selected paper |
| Advanced Proofs | Spectral theorem, Jordan form | Friedberg Ch. 6.7, 7.1 | Study Friedberg proofs (4 hrs); Summarize (1 hr) | Proof summaries; Applications to kernels | - |
| Measure Theory | Probability spaces, Lebesgue integral | Royden Ch. 2-3 | Read Royden Ch. 2 (3 hrs); Solve Ex. 2.5 (2 hrs); Sigma-algebra examples (1 hr) | Measure theory notes; Sigma-algebra examples | - |
| Specialization Elective | Choose: NLP (BERT) or CV (ResNet) | BERT (Devlin et al., 2018) / ResNet (He et al., 2015) | Implement BERT embedding or ResNet block (2 hrs) | Code + performance report | Devlin et al. (2018) or He et al. (2015) |
| Portfolio Prep | GitHub, summary document | - | Organize code/docs (1 hr) | GitHub repo | - |
| Portfolio Prep | GitHub, summary document | - | Write READMEs (2 hrs); Organize code/docs (2 hrs); 1-page research summary (2 hrs) | 1-page research summary | - |
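
To give a flavor of the deliverables listed above, here is a minimal NumPy sketch of the "Truncated SVD for image compression" task; the random matrix is only a stand-in for a real grayscale image.

```python
import numpy as np

def truncated_svd_compress(image: np.ndarray, k: int) -> np.ndarray:
    """Rank-k approximation of a 2-D array via truncated SVD."""
    U, S, Vt = np.linalg.svd(image, full_matrices=False)
    return (U[:, :k] * S[:k]) @ Vt[:k, :]  # keep the k largest singular values

# Toy usage: a random 64x64 "image" stands in for real pixel data.
img = np.random.default_rng(0).random((64, 64))
approx = truncated_svd_compress(img, k=8)
rel_err = np.linalg.norm(img - approx) / np.linalg.norm(img)
print(f"rank-8 relative reconstruction error: {rel_err:.3f}")
```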

ML/DL Plan Table

| Focus Area | Learning Objectives | Resources (Books/Papers) | Courses/Video Lectures | Hands-on Tasks | Community/Research Engagement | Deliverables/Outputs |
| --- | --- | --- | --- | --- | --- | --- |
| Supervised Learning & Optimization | Master ML fundamentals: loss functions, gradients, SGD | PRML (Ch. 1-3); SGD paper (Robbins & Monro, 1951) | CS229 (Supervised Learning) | Implement logistic regression from scratch | Join ML forums (e.g., Reddit r/MachineLearning) | Code + report comparing SGD variants |
| Advanced Linear Algebra & Matrix Computations | Explore SVD, eigen-decomposition, and their theoretical implications | Advanced sections in MML; Recent papers on matrix factorization in DL | YouTube seminars by 3Blue1Brown (advanced topics) | Implement SVD/PCA from scratch, analyze datasets | Share findings on GitHub and forums | SVD/PCA implementation with analysis report |
| Neural Network Mechanics | Understand backpropagation, activation functions | Deep Learning (Ch. 6); Backprop paper (Rumelhart et al., 1986) | MIT 6.S191 (Intro to Deep Learning) | Code a 3-layer MLP with NumPy | Engage in online study groups (Slack/Discord) | MLP implementation with ablation study |
| Regularization & Evaluation | Advanced model tuning: dropout, batch norm | DL Book (Ch. 7); Dropout paper (Srivastava et al., 2014) | CS231n (Training Neural Networks I) | Implement dropout/BatchNorm in PyTorch | Discuss results and adjustments on discussion boards | Trained model with regularization analysis |
| Convolutional Networks (CV) | Learn CNN architectures, pooling, strides | DL Book (Ch. 9); AlexNet paper (Krizhevsky et al., 2012) | CS231n (CNNs) | Build a CNN for CIFAR-10 classification | Post training metrics and troubleshooting tips online | CNN codebase + accuracy report |
| Residual Networks (CV) | Study skip connections, deep network training | ResNet paper (He et al., 2016) | CS231n (Advanced CNN Architectures) | Implement ResNet-18 on TinyImageNet | Collaborate with peers for design review | ResNet code + training curves |
| Attention Mechanisms | Explore self-attention, transformer basics | Attention Is All You Need (Vaswani et al., 2017) | CS224n (Transformers) | Code a single-head attention layer | Discuss challenges in study groups and on GitHub | Attention implementation with visualization |
| Optimization Deep Dive | Advanced optimizers: AdamW, LAMB | Adam paper (Kingma & Ba, 2017); Deep Learning Tuning Playbook | Fast.ai (Practical Deep Learning) | Benchmark optimizers on a vision task | Share experiment results on GitHub or forums | Optimizer comparison dashboard |
| VAEs & GANs | Latent variable models, adversarial training | VAE paper (Kingma & Welling, 2013); GAN paper (Goodfellow et al., 2014) | CS236 (Deep Generative Models) | Train a DCGAN on CelebA | Engage in discussions about training challenges | Generated samples + FID score report |
| Transformers & BERT (NLP) | Pre-training, fine-tuning strategies | BERT paper (Devlin et al., 2018); RoBERTa analysis (Liu et al., 2019) | Hugging Face NLP Course | Fine-tune BERT for sentiment analysis | Participate in online transformer discussions | Fine-tuning notebook + accuracy metrics |
| Diffusion Models | Score-based generative models | DDPM paper (Ho et al., 2020); DDIM paper (Song et al., 2021) | Stable Diffusion Tutorials | Implement a 1D diffusion process | Engage with the research community via Twitter/LinkedIn | Diffusion code + generated samples |
| Vision Transformers (NLP, CV) | Patch embeddings, hybrid architectures | ViT paper (Dosovitskiy et al., 2020); DeiT (Touvron et al., 2021) | CS231n (Vision Transformers) | Train ViT on CIFAR-100 | Post initial findings on an ML blog | ViT implementation + ablation study |
| Paper Analysis | Critique methodology of recent SOTA paper | Select 1 NeurIPS/ICLR 2023-24 paper (e.g., LLaMA, DALL-E 3) | Paper author talks on YouTube | Reproduce key figures/tables | Post critiques on academic blogs/discussion boards | Critical analysis report |
| System Implementation | Replicate core model components | Official codebase (if available); Supplementary materials | Clone GitHub repositories | Refactor code for readability | Share detailed replication notes on a blog or forum | Clean reimplementation + documentation |
| Hyperparameter Search | Bayesian optimization, multi-objective tuning | Hyperparameter paper (Snoek et al., 2012); Optuna Docs | Weights & Biases Tutorials | Run large-scale hyperparameter sweep | Discuss results and adjustments on discussion boards | Hyperparameter analysis dashboard |
| Benchmarking | Compare with baselines, compute metrics | ML reproducibility checklist (Pineau et al.) | MLOps Zoomcamp (Evaluation) | Test on 2+ datasets (e.g., ImageNet-1k, COCO) | Share experiment results on GitHub or forums | Benchmark report with statistical tests |
| Choose Track: CV/NLP/RL | Deep dive into subfield tools/libraries | CV: Detectron2 Docs; NLP: Hugging Face; RL: Stable Baselines3 | Advanced domain courses (e.g., CS234 for RL) | Build an end-to-end project (e.g., object detection pipeline) | Engage with mentors and specialized online groups | Project codebase + demo video |
| Novel Research Project | Formulate hypothesis, design experiments | Literature from arXiv (last 6 months) | Research group meetings (simulated via Discord) | Write preprint-style paper | Present findings in virtual meetups | Publishable manuscript + conference submission |
| Measure Theory | Probability spaces, Lebesgue integral | Royden Ch. 2-3 | - | Read Royden Ch. 2 (3 hrs); Solve Ex. 2.5 (2 hrs) | - | Measure theory notes; Sigma-algebra examples |
| Specialization Elective | Choose: NLP (BERT) or CV (ResNet) | BERT (Devlin et al., 2018) / ResNet (He et al., 2015) | - | Implement BERT embedding or ResNet block (5 hrs) | - | Code + performance report |
| Portfolio Prep | GitHub, summary document | - | - | Organize code/docs (3 hrs); Write READMEs (2 hrs) | - | GitHub repo; 1-page research summary |
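
Likewise, for the "Code a single-head attention layer" task, here is a minimal NumPy sketch of scaled dot-product self-attention (no masking, no multi-head split); the random weight matrices stand in for learned parameters.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X: np.ndarray, Wq: np.ndarray, Wk: np.ndarray, Wv: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product self-attention (as in Vaswani et al., 2017)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # (seq_len, seq_len) similarity logits
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                       # (seq_len, d_head) contextualized outputs

# Toy usage with random embeddings and projection weights.
rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 16, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # -> (5, 8)
```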
This post is licensed under CC BY 4.0 by the author.