Jump2Paper Archive

Records: 19 · Languages: 1 · Updated: 2026-04-07
Document 001

Agent Lightning: Train ANY AI Agents with Reinforcement Learning

multi-turn GRPO · veRL · multi-agent LLM · agent frameworks · DeepSeek-R1
Korean
Document 002

Automatic Curriculum Learning for Deep RL: A Short Survey

Multi-Goal RL · Intrinsic Motivation · PCG for RL · Sim2Real Transfer · Self-Play
Korean
Document 003

Born-Again Neural Networks

Label Smoothing · Knowledge Distillation · Dark Knowledge · Knowledge Distillation Survey · Self-Distillation
Korean
Document 004

Curriculum Learning for LLM Pretraining: An Analysis of Learning Dynamics

curriculum learning · LLM pretraining · Pythia scaling · HMM training trajectory · pretraining data mixture
Korean
Document 005

DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training

multi-armed bandit · distribution sampling · data mixture training · UCB bandit · curriculum learning
Korean
Document 006

Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining

data-efficient training · Sample Efficiency LLM · Data Selection for LLMs · SlimPajama · DRO Machine Learning
Korean
Document 007

Exclusive Self Attention

Attention Mechanism · Long Context · SA-FFN division of roles · Attention Sink · Transformer improvements
Korean
Document 008

KV Cache Transform Coding for Compact Storage in LLM Inference

PCA decorrelation · adaptive quantization · learned transform coding · speculative decoding · transform coding
Korean
Document 009

KVzap: Fast, Adaptive, and Faithful KV Cache Pruning

LLM inference efficiency · long-context LLM · KV quantization · surrogate model · reasoning efficiency
Korean
Document 010

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Mamba-2 · S4 · linear-time sequence modeling · state space model · HyenaDNA
Korean
Document 011

PivotRL: High Accuracy Agentic Post-Training at Low Compute Cost

curriculum learning RL · agentic post-training · BC-to-RL bridge · natural gradient RL · SWE-Bench
Korean
Document 012

RegMix: Data Mixture as Regression for Language Model Pre-Training

pre-training data selection · data mixture · gradient boosting · Dirichlet sampling · proxy model methods
Korean
Document 013

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Hierarchical Vision Transformer · Shifted Window Attention · Semantic Segmentation · Hierarchical Feature Map · Masked Image Modeling
Korean
Document 014

The Forward-Forward Algorithm: Some Preliminary Investigations

predictive coding · Noise Contrastive Estimation · contrastive learning · backpropagation alternatives · Forward-Forward Algorithm
Korean
Document 015

Think Anywhere in Code Generation

inline reasoning · interleaved thinking · chain-of-thought · code generation LLM · LoRA
Korean
Document 016

TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training

domain adaptation · curriculum learning · data mixture · dynamic data mixture · proxy-free data selection
Korean
Document 017

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

linear attention · Transformer–RNN equivalence · kernel self-attention · RetNet · linear language models
Korean
Document 018

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

random rotation quantization · online quantization · KV cache quantization · vector quantization · online vector quantization
Korean
Document 019

Very Large-Scale Multi-Agent Simulation in AgentScope

AgentScope · Generative Agents · Distributed AI · Game Theory + LLM · Large-Scale Multi-Agent Simulation
Korean