Codenext - Software Engineer (Python)
Vernier, Hotel Du Parc
...the LLM development lifecycle, from data processing and model training to benchmarking. Our approach leverages supervised fine-tuning (SFT), direct preference optimization (DPO), reinforcement learning from human feedback (RLHF), and retrieval-augmented generation (RAG). Responsibilities: Develop and maintain [...]
Category: Media / Publishing / Editorial