DEV Community

Npc-0-1


Hello, guys!

My name is Npc (it's just a nickname, but it's pretty much what people call me in real life).

  • Position: Python Engineer
  • From: China!!
  • Age: 24 (2001-12-25)
  • Education: Associate Degree
  • Major: Big Data Technology (Courses: Java, Python, Apache Hadoop, Scala, Spark, HBase, Hive, Linux, Flink, MySQL, Redis)

Current Technical Stack

Python, PyTorch, Transformers, Go, MySQL, Redis, PostgreSQL, MongoDB, Elasticsearch, ClickHouse

Awards

First Prize in Provincial Big Data Skills Competition (Associate Degree Level)

Hobbies

FPV Drones, Fixed-wing Model Aircraft, Programming.

University Learning Experience:

  • Year 1: Lost and unfocused: little social experience, loved gaming, absorbed maybe 70% of my coursework, and had no self-directed learning habits outside class.
  • Year 2: Sudden growth and a restructured mindset: I clearly recognized this was my last stretch of usable time, made a Python learning plan, and surpassed the school curriculum that same year.
  • Year 3: The pandemic hit and I studied from home, using self-taught Python web-scraping skills to acquire data. My drive to learn skyrocketed; I advanced to self-directed study of big data analysis (Java, Scala, Spark, HBase, Hive, Flink) and surpassed all my classmates that year.

Career Experience:

  • Phase 1: Moved from Guangdong to intern in Shanghai, learned mainstream big data analysis projects (Spark-based), and worked on the front line handling massive industry data analysis demands while continuing my own research. The pandemic worsened, and the internship ended early, after less than a year.
  • Phase 2: As interest in big data technology cooled, SMEs turned to large data service providers to cut costs. I joined a large data service provider in Shenzhen, Guangdong (Python data scraping and analysis), kept up my self-study, and came to understand the state of the industry: SMEs don't need big data; data service providers are rising; data matters more than technology; positions are shrinking, and SME roles have a low survival rate.
  • Phase 3: Demand kept falling and big data technology became auxiliary; my main work effectively shifted to integrating API data service interfaces. That same year I jumped to a technical service provider in the tourism industry, where my big data plus scraping/analysis experience became an advantage and drove innovation. I focused on data collection and hotel business data aggregation algorithms (traditional, hand-engineered).
  • Phase 4: Demand for data collection and hotel business data analysis kept rising, alongside growing auxiliary API data service work. The technical stack I ended up continuously learning: Python, Go, MySQL, Redis, PostgreSQL, MongoDB, Elasticsearch, ClickHouse.
  • Phase 5: The AI explosion arrived (ChatGPT debuted). As one of the first wave of users, I once again developed a strong interest in an emerging field and began deep research and exploration.
  • Phase 6 (Now):
    1. Researching how to migrate our current hotel physical-information aggregation algorithm to AI: studying AI's development history (CUDA/RNN/CNN/TensorFlow/Torch/Transformers) and cutting in at the code level to understand PyTorch and Transformers (decoder-only / encoder-only / encoder-decoder).
    2. Tried serving the hotel aggregation business from the AI application layer: generalization improved, but accuracy was uncontrollable. Optimizing prompts didn't fix it. The essential reasons: attention problems; hallucinations; generative models are a poor fit for clustering-style tasks; high cost and slow speed; a knowledge base can't guarantee consistent retrieval results every time; inherently uncertain outputs are incompatible with this business, whose error tolerance is near zero.
    3. Abandoned the application-layer API approach and made progress with encoder-only experiments (vertical-domain fine-tuning): effective, with no hallucinations; 10,000 retries give consistently correct results; generalization is acceptable (a significant improvement over the previous engineered algorithms); fast and cheap (actually slower and more expensive than the old engineered algorithms, but very fast and cheap by LLM standards).
    4. Encoder-only vertical-domain fine-tuning continues, and in parallel I keep studying decoder-only principles in depth (why does this work? why doesn't that?), rather than picking up surface-level understanding from finished APIs and open-source models. I feel I need to understand things more thoroughly from the bottom layer, even if the split is 70% theory to 30% code; I believe that's an efficient ratio. It's a learning method that lives and dies with AI: with enough theory organized, I can let AI generate the code, then modify and experiment until the final requirement is met. (This isn't over-reliance on AI; if AI dies, that code knowledge dies with it, though I believe the conceptual knowledge lasts forever and is what matters.) I call this progressive learning, and it's different from how I learned before.
    5. Continuously updating (those who lie usually forget)....
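The key property I found in the encoder-only approach — the same input always yields the same label, unlike sampling from a generative decoder — can be illustrated with a minimal sketch. This is a hypothetical toy model (`TinyEncoderClassifier` is my own illustration, not the actual fine-tuned model), showing only the architectural shape: a small transformer encoder with a classification head, run in `eval()` mode so inference is deterministic.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of an encoder-only classifier (not a real fine-tuned
# model): embeddings -> transformer encoder -> pooled classification head.
class TinyEncoderClassifier(nn.Module):
    def __init__(self, vocab_size=1000, d_model=32, num_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(d_model, num_labels)

    def forward(self, ids):
        h = self.encoder(self.embed(ids))   # (batch, seq, d_model)
        return self.head(h.mean(dim=1))     # mean-pool -> logits

torch.manual_seed(0)
model = TinyEncoderClassifier().eval()      # eval() disables dropout
ids = torch.randint(0, 1000, (1, 8))        # dummy token ids

with torch.no_grad():
    first = model(ids).argmax(dim=-1)
    # Repeated inference on the same input always yields the same label --
    # the discriminative head has no sampling step, unlike a decoder.
    for _ in range(10):
        assert model(ids).argmax(dim=-1).equal(first)
```

The contrast with a decoder API is that classification here is a pure `argmax` over logits, with no temperature and no token-by-token sampling, which is why retrying the same input cannot change the answer.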

I also hope my learning speed can outpace the pace of obsolescence, rather than being eliminated and left starving beside a trash can...
