Taeyoun Kim
I am a second-year MS student in Machine Learning at Carnegie Mellon University, funded by the Kwanjeong Educational Foundation. I am advised by Aditi Raghunathan and also work with Maarten Sap.
I believe that enhancing out-of-distribution / adversarial robustness and improving human alignment of foundation models make machine learning more accessible to the public. I am interested in understanding when models break out-of-distribution, so that robustness can be improved through better pretraining/fine-tuning and high-quality data curation. I am also interested in inference-time scaling for better human alignment: the training regime is saturating, but little is understood about the limits of leveraging compute during inference.
Before CMU, I earned my Bachelor's degree in Electrical & Electronic Engineering from Yonsei University. My undergraduate studies were funded through the National Science and Technology Scholarship of South Korea.
CV /
Google Scholar /
GitHub /
LinkedIn /
Email
Publications
My recent work is on testing the limits of defenses in jailbreaking, social bias mitigation in retrieval-augmented generation (RAG), and out-of-distribution performance estimation of foundation models using Agreement-on-the-Line.
Testing the Limits of Jailbreaking Defenses with the Purple Problem
Taeyoun Kim*, Suhas Kotha*, Aditi Raghunathan
NeurIPS Safe GenAI, 2024
arxiv /
code /
The Purple Problem: Can jailbreaking defenses succeed against the simplest possible definition of safety, preventing the output of the word "purple"? All defenses we consider fail to enforce even this definition. Moreover, adaptive attacks and increased compute reveal that existing defenses are weaker than reported.
Predicting the Performance of Foundation Models via Agreement-on-the-Line
Rahul Saxena*, Taeyoun Kim*, Aman Mehra*, Christina Baek, Zico Kolter, Aditi Raghunathan
NeurIPS, 2024
arxiv /
We apply Agreement-on-the-Line to predicting the OOD performance of foundation models. Interestingly, we find that randomly initializing the linear head before fine-tuning yields the highest diversity, enabling an ensemble of models to exhibit AGL. This even applies to linear probing.
The Application of Local Sub-voxel Shifting on Multi-echo GRE-based Myelin Water Imaging
Taeyoun Kim, Muyul Park, Jaeuk Yi, Dong-Hyun Kim
ICMRI (Oral), 2021
We apply local sub-voxel shifting to reduce Gibbs artifacts in multi-echo GRE-based Myelin Water Imaging. To do this, we create a new exponential saddle filter, which removes Gibbs artifacts while preserving image quality without the blurring introduced by Tukey filtering.
Projects
Ongoing and past projects
Mitigating Social Bias in RAG
Ongoing, 2024
We decompose a RAG system into three components: the LLM, the embedder, and the corpus. Each component can introduce its own bias, and these biases accumulate into complex bias in the overall system. We find that it is possible to mitigate bias simply by reverse-biasing the embedder. Furthermore, we empirically find a linear relationship between the embedder's bias and the RAG system's bias, with sensitivity that varies across LLMs. We investigate three methods for mitigating bias, fine-tuning, projecting, and stochastic ranking, and find that fine-tuning reduces bias while maintaining utility. We also find that a reverse-biased embedder makes the entire RAG system robust to variations in corpus bias.
Generalizing Point-and-Click Behavior to Vision
Ongoing, 2022
We model human point-and-click behavior with Soft Actor-Critic to understand human motor control within the BUMP process. We generalize point-and-click to perform any visual task as long as a target region and an avoidance region are provided.
Impact of Different Joints on Creating a 3D Hand Mesh
Taeyoun Kim*, Hoseok Tong*, Jinoh Lee*
Undergraduate Thesis, 2022
We use a PointNet to reconstruct hand meshes from 26 hand joints extracted from the Microsoft HoloLens 2. We study the impact of the fingertips and metacarpals, showing that fingertips are crucial for optimizing the network.