Arxiv Daily

Index	Arxiv ID	Title	First Author	Submit Date	Abstract	PDF Links	Github Code	Paper With Code
0	2309.14737v2	Volumetric Semantically Consistent 3D Panoptic Mapping	Yang Miao	2023-09-26	Show Abstract	PDF		Link
1	2403.02175v2	LiSTA: Geometric Object-Based Change Detection in Cluttered Environments	Joseph Rowell	2024-03-04	Show Abstract	PDF
2	2309.15065v2	Language-EXtended Indoor SLAM (LEXIS): A Versatile System for Real-time Visual Scene Understanding	Christina Kassab	2023-09-26	Show Abstract	PDF
3	2403.02280v1	Tightly-Coupled LiDAR-Visual-Inertial SLAM and Large-Scale Volumetric Occupancy Mapping	Simon Boche	2024-03-04	Show Abstract	PDF
4	2403.02235v1	Structure from WiFi (SfW): RSSI-based Geometric Mapping of Indoor Environments	Junseo Kim	2024-03-04	Show Abstract	PDF
5	2309.06635v3	Collaborative Dynamic 3D Scene Graphs for Automated Driving	Elias Greve	2023-09-12	Show Abstract	PDF	Link	Link
6	2309.10314v4	Dive Deeper into Rectifying Homography for Stereo Camera Online Self-Calibration	Hongbo Zhao	2023-09-19	Show Abstract	PDF
7	2402.03246v3	SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM	Mingrui Li	2024-02-05	Show Abstract	PDF
8	2403.01110v1	Grid-based Fast and Structural Visual Odometry	Zhang Zhihe	2024-03-02	Show Abstract	PDF
9	2306.14137v2	BotanicGarden: A High-Quality Dataset for Robot Navigation in Unstructured Natural Environments	Yuanzhi Liu	2023-06-25	Show Abstract	PDF	Link
10	2403.00976v1	Joint Spatial-Temporal Calibration for Camera and Global Pose Sensor	Junlin Song	2024-03-01	Show Abstract	PDF
11	2402.14591v2	High-Speed Detector For Low-Powered Devices In Aerial Grasping	Ashish Kumar	2024-02-22	Show Abstract	PDF
12	2309.10225v2	VPRTempo: A Fast Temporally Encoded Spiking Neural Network for Visual Place Recognition	Adam D. Hines	2023-09-19	Show Abstract	PDF		Link
13	2403.00228v1	DISORF: A Distributed Online NeRF Training and Rendering Framework for Mobile Robots	Chunlin Li	2024-03-01	Show Abstract	PDF
14	2309.05134v2	Benchmarking ground truth trajectories with robotic total stations	Effie Daum	2023-09-10	Show Abstract	PDF
15	2402.18771v1	NARUTO: Neural Active Reconstruction from Uncertain Target Observations	Ziyue Feng	2024-02-29	Show Abstract	PDF
16	2402.18318v1	SD-SLAM: A Semantic SLAM Approach for Dynamic Scenes Based on LiDAR Point Clouds	Feiya Li	2024-02-28	Show Abstract	PDF
17	2402.18174v1	Generation of skill-specific maps from graph world models for robotic systems	Koen de Vos	2024-02-28	Show Abstract	PDF
18	2402.03762v4	MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction	Heng Zhou	2024-02-06	Show Abstract	PDF
19	2402.13609v2	VOOM: Robust Visual Object Odometry and Mapping using Hierarchical Landmarks	Yutong Wang	2024-02-21	Show Abstract	PDF		Link
20	2402.16082v1	Modeling Point Uncertainty in Radar SLAM	Yang Xu	2024-02-25	Show Abstract	PDF
21	2402.14345v2	An Error-Matching Exclusion Method for Accelerating Visual SLAM	Shaojie Zhang	2024-02-22	Show Abstract	PDF
22	2402.13488v2	A Feature Matching Method Based on Multi-Level Refinement Strategy	Shaojie Zhang	2024-02-21	Show Abstract	PDF
23	2402.15961v1	VOLoc: Visual Place Recognition by Querying Compressed Lidar Map	Xudong Cai	2024-02-25	Show Abstract	PDF		Link
24	2402.11790v2	CoLRIO: LiDAR-Ranging-Inertial Centralized State Estimation for Robotic Swarms	Shipeng Zhong	2024-02-19	Show Abstract	PDF		Link
25	2402.14308v1	Ground-Fusion: A Low-cost Ground SLAM System Robust to Corner Cases	Jie Yin	2024-02-22	Show Abstract	PDF
26	2402.14280v1	Secure Navigation using Landmark-based Localization in a GPS-denied Environment	Ganesh Sapkota	2024-02-22	Show Abstract	PDF
27	2402.13817v1	Khronos: A Unified Approach for Spatio-Temporal Metric-Semantic SLAM in Dynamic Environments	Lukas Schmid	2024-02-21	Show Abstract	PDF		Link
28	2402.13537v1	EffLoc: Lightweight Vision Transformer for Efficient 6-DOF Camera Relocalization	Zhendong Xiao	2024-02-21	Show Abstract	PDF
29	2402.13255v1	How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey	Fabio Tosi	2024-02-20	Show Abstract	PDF
30	2402.07429v2	Particle Filter SLAM for Vehicle Localization	Tianrui Liu	2024-02-12	Show Abstract	PDF
31	2402.12551v1	Landmark-based Localization using Stereo Vision and Deep Learning in GPS-Denied Battlefield Environment	Ganesh Sapkota	2024-02-19	Show Abstract	PDF
32	2402.12149v1	MLFEF: Machine Learning Fusion Model with Empirical Formula to Explore the Momentum in Competitive Sports	Ruixin Peng	2024-02-19	Show Abstract	PDF
33	2402.11680v1	3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods	Till Beemelmanns	2024-02-18	Show Abstract	PDF		Link
34	2402.09944v1	Loopy-SLAM: Dense Neural SLAM with Loop Closures	Lorenzo Liso	2024-02-14	Show Abstract	PDF
35	2402.08897v1	RB5 Low-Cost Explorer: Implementing Autonomous Long-Term Exploration on Low-Cost Robotic Hardware	Adam Seewald	2024-02-14	Show Abstract	PDF		Link
36	2402.08846v1	An Embarrassingly Simple Approach for LLM with Strong ASR Capacity	Ziyang Ma	2024-02-13	Show Abstract	PDF
37	2309.06950v3	3D Active Metric-Semantic SLAM	Yuezhan Tao	2023-09-13	Show Abstract	PDF
38	2402.08125v1	Customizable Perturbation Synthesis for Robust SLAM Benchmarking	Xiaohao Xu	2024-02-12	Show Abstract	PDF		Link
39	2402.07537v1	UAV-assisted Visual SLAM Generating Reconstructed 3D Scene Graphs in GPS-denied Environments	Ahmed Radwan	2024-02-12	Show Abstract	PDF
40	2402.06951v1	Semantic Object-level Modeling for Robust Visual Camera Relocalization	Yifan Zhu	2024-02-10	Show Abstract	PDF
41	2212.04745v3	SLAM for Visually Impaired People: a Survey	Marziyeh Bamdad	2022-12-09	Show Abstract	PDF
42	2402.06131v1	PAS-SLAM: A Visual SLAM System for Planar Ambiguous Scenes	Xinggang Hu	2024-02-09	Show Abstract	PDF
43	2309.14063v3	Preferential Multi-Target Search in Indoor Environments using Semantic SLAM	Akash Chikhalikar	2023-09-25	Show Abstract	PDF
44	2402.05254v1	Online and Certifiably Correct Visual Odometry and Mapping	Devansh R Agrawal	2024-02-07	Show Abstract	PDF
45	2402.05003v1	Efficient Invariant Kalman Filter for Inertial-based Odometry with Large-sample Environmental Measurements	Xinghan Li	2024-02-07	Show Abstract	PDF		Link
46	2309.14641v2	Adaptive Denoising-Enhanced LiDAR Odometry for Degeneration Resilience in Diverse Terrains	Mazeyu Ji	2023-09-26	Show Abstract	PDF

Index	Arxiv ID	Title	First Author	Submit Date	Abstract	PDF Links	Github Code	Paper With Code
0	2402.11431v2	A Robust Error-Resistant View Selection Method for 3D Reconstruction	Shaojie Zhang	2024-02-18	Show Abstract	PDF
1	2402.14650v1	GaussianPro: 3D Gaussian Splatting with Progressive Propagation	Kai Cheng	2024-02-22	Show Abstract	PDF
2	2402.12025v1	Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?	Marco Gaido	2024-02-19	Show Abstract	PDF
3	2402.11287v1	Dense Matchers for Dense Tracking	Tomáš Jelínek	2024-02-17	Show Abstract	PDF
4	2209.03910v2	PixTrack: Precise 6DoF Object Pose Tracking using NeRF Templates and Feature-metric Alignment	Prajwal Chidananda	2022-09-08	Show Abstract	PDF		Link
5	2309.11883v2	On-the-Fly SfM: What you capture is What you get	Zongqian Zhan	2023-09-21	Show Abstract	PDF		Link
6	2311.17245v4	LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS	Zhiwen Fan	2023-11-28	Show Abstract	PDF		Link
7	2312.10109v2	Enlighten-Your-Voice: When Multimodal Meets Zero-shot Low-light Image Enhancement	Xiaofeng Zhang	2023-12-15	Show Abstract	PDF		Link
8	2401.17592v1	Local Feature Matching Using Deep Learning: A Survey	Shibiao Xu	2024-01-31	Show Abstract	PDF
9	2304.07250v3	Fusing Structure from Motion and Simulation-Augmented Pose Regression from Optical Flow for Challenging Indoor Environments	Felix Ott	2023-04-14	Show Abstract	PDF
10	2306.15667v4	PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment	Jianyuan Wang	2023-06-27	Show Abstract	PDF
11	2401.14289v1	Speech foundation models on intelligibility prediction for hearing-impaired listeners	Santiago Cuervo	2024-01-24	Show Abstract	PDF
12	2401.11711v1	HG3-NeRF: Hierarchical Geometric, Semantic, and Photometric Guided Neural Radiance Fields for Sparse View Inputs	Zelin Gao	2024-01-22	Show Abstract	PDF
13	2401.10886v1	SCENES: Subpixel Correspondence Estimation With Epipolar Supervision	Dominik A. Kloepfer	2024-01-19	Show Abstract	PDF
14	2401.09252v1	3D Scene Geometry Estimation from 360° Imagery: A Survey	Thiago Lopes Trugillo da Silveira	2024-01-17	Show Abstract	PDF
15	2401.08937v1	ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization	Weiyao Wang	2024-01-17	Show Abstract	PDF
16	2401.08043v1	Cross-Modal Semi-Dense 6-DoF Tracking of an Event Camera in Challenging Conditions	Yi-Fan Zuo	2024-01-16	Show Abstract	PDF		Link
17	2301.08422v3	A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles	Zhefan Xu	2023-01-20	Show Abstract	PDF		Link
18	2304.03930v2	Photometric Correction for Infrared Sensors	Jincheng Zhang	2023-04-08	Show Abstract	PDF
19	2401.05236v1	Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects	Tianhang Cheng	2024-01-10	Show Abstract	PDF		Link
20	2401.03450v1	A Classification of Critical Configurations for any Number of Projective Views	Martin Bråtelund	2024-01-07	Show Abstract	PDF		Link
21	2312.11153v2	Research on Multilingual Natural Scene Text Detection Algorithm	Tao Wang	2023-12-18	Show Abstract	PDF
22	2306.09012v3	Yes, we CANN: Constrained Approximate Nearest Neighbors for local feature-based visual localization	Dror Aiger	2023-06-15	Show Abstract	PDF		Link
23	2310.03704v3	Pose-Free Generalizable Rendering Transformer	Zhiwen Fan	2023-10-05	Show Abstract	PDF
24	2312.15471v1	Residual Learning for Image Point Descriptors	Rashik Shrestha	2023-12-24	Show Abstract	PDF
25	2312.13977v2	NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views	Han Huang	2023-12-21	Show Abstract	PDF
26	2312.10529v1	Transformers in Unsupervised Structure-from-Motion	Hemang Chawla	2023-12-16	Show Abstract	PDF		Link
27	2312.08863v1	HeadRecon: High-Fidelity 3D Head Reconstruction from Monocular Video	Xueying Wang	2023-12-14	Show Abstract	PDF
28	2312.08760v1	CF-NeRF: Camera Parameter Free Neural Radiance Fields with Incremental Learning	Qingsong Yan	2023-12-14	Show Abstract	PDF
29	2312.07504v1	COLMAP-Free 3D Gaussian Splatting	Yang Fu	2023-12-12	Show Abstract	PDF
30	2312.06865v1	Keypoint-based Stereophotoclinometry for Characterizing and Navigating Small Bodies: A Factor Graph Approach	Travis Driver	2023-12-11	Show Abstract	PDF
31	2312.06741v1	Gaussian Splatting SLAM	Hidenobu Matsuki	2023-12-11	Show Abstract	PDF
32	2308.08479v3	DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local Feature Matching	Johan Edstedt	2023-08-16	Show Abstract	PDF		Link
33	2312.05889v1	SuperPrimitive: Scene Reconstruction at a Primitive Level	Kirill Mazur	2023-12-10	Show Abstract	PDF
34	2312.04563v1	Visual Geometry Grounded Deep Structure From Motion	Jianyuan Wang	2023-12-07	Show Abstract	PDF
35	2308.15984v2	Learning Structure-from-Motion with Graph Attention Networks	Lucas Brynte	2023-08-30	Show Abstract	PDF
36	2312.00451v1	FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting	Zehao Zhu	2023-12-01	Show Abstract	PDF	Link
37	2311.18801v1	Distributed Global Structure-from-Motion with a Deep Front-End	Ayush Baid	2023-11-30	Show Abstract	PDF		Link
38	2307.11702v3	SACReg: Scene-Agnostic Coordinate Regression for Visual Localization	Jerome Revaud	2023-07-21	Show Abstract	PDF
39	2311.11808v2	Robot Hand-Eye Calibration using Structure-from-Motion	Nicolas Andreff	2023-11-20	Show Abstract	PDF

Index	Arxiv ID	Title	First Author	Submit Date	Abstract	PDF Links	Paper With Code
0	2403.03218v1	The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning	Nathaniel Li	2024-03-05	Show Abstract	PDF
1	2403.03203v1	CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments	Savitha Sam Abraham	2024-03-05	Show Abstract	PDF
2	2403.01548v2	In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation	Shiqi Chen	2024-03-03	Show Abstract	PDF	Link
3	2403.03194v1	MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets	Hossein Aboutalebi	2024-03-05	Show Abstract	PDF
4	2403.01777v2	NPHardEval4V: A Dynamic Reasoning Benchmark of Multimodal Large Language Models	Lizhou Fan	2024-03-04	Show Abstract	PDF
5	2403.03188v1	Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement	Rafaela Martelo	2024-03-05	Show Abstract	PDF	Link
6	2403.01616v2	Towards Comprehensive Vietnamese Retrieval-Augmented Generation and Large Language Models	Nguyen Quang Duc	2024-03-03	Show Abstract	PDF
7	2310.00194v2	A Prefrontal Cortex-inspired Architecture for Planning in Large Language Models	Taylor Webb	2023-09-30	Show Abstract	PDF	Link
8	2403.03170v1	SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection	Peng Qi	2024-03-05	Show Abstract	PDF
9	2403.03167v1	PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset	Arda Uzunoğlu	2024-03-05	Show Abstract	PDF	Link
10	2403.03163v1	Design2Code: How Far Are We From Automating Front-End Engineering?	Chenglei Si	2024-03-05	Show Abstract	PDF
11	2306.04735v2	Soft-prompt Tuning for Large Language Models to Evaluate Bias	Jacob-Junqi Tian	2023-06-07	Show Abstract	PDF
12	2403.03141v1	Language Guided Exploration for RL Agents in Text Environments	Hitesh Golchha	2024-03-05	Show Abstract	PDF
13	2305.14342v4	Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training	Hong Liu	2023-05-23	Show Abstract	PDF	Link
14	2403.03121v1	Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution	Flor Miriam Plaza-del-Arco	2024-03-05	Show Abstract	PDF
15	2403.03102v1	"In Dialogues We Learn": Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning	Chuanqi Cheng	2024-03-05	Show Abstract	PDF
16	2403.03101v1	KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents	Yuqi Zhu	2024-03-05	Show Abstract	PDF	Link
17	2305.14824v3	Mitigating Temporal Misalignment by Discarding Outdated Facts	Michael J. Q. Zhang	2023-05-24	Show Abstract	PDF	Link
18	2403.03031v1	Learning to Use Tools via Cooperative and Interactive Agents	Zhengliang Shi	2024-03-05	Show Abstract	PDF
19	2309.15065v2	Language-EXtended Indoor SLAM (LEXIS): A Versatile System for Real-time Visual Scene Understanding	Christina Kassab	2023-09-26	Show Abstract	PDF
20	2403.03029v1	Socratic Reasoning Improves Positive Text Rewriting	Anmol Goel	2024-03-05	Show Abstract	PDF
21	2403.03028v1	Word Importance Explains How Prompts Affect Language Model Outputs	Stefan Hackmann	2024-03-05	Show Abstract	PDF
22	2403.03017v1	OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following	Haochen Shi	2024-03-05	Show Abstract	PDF
23	2403.03008v1	Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations	Hasan Abu-Rasheed	2024-03-05	Show Abstract	PDF
24	2401.06071v5	GroundingGPT: Language Enhanced Multi-modal Grounding Model	Zhaowei Li	2024-01-11	Show Abstract	PDF	Link
25	2403.03003v1	Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models	Gen Luo	2024-03-05	Show Abstract	PDF	Link
26	2403.00818v2	DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models	Wei He	2024-02-26	Show Abstract	PDF	Link
27	2403.02993v1	Localized Zeroth-Order Prompt Optimization	Wenyang Hu	2024-03-05	Show Abstract	PDF
28	2403.02990v1	Data Augmentation using LLMs: Data Perspectives, Learning Paradigms and Challenges	Bosheng Ding	2024-03-05	Show Abstract	PDF
29	2403.00867v2	Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes	Xiaomeng Hu	2024-03-01	Show Abstract	PDF
30	2403.02969v1	Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception	Junwen He	2024-03-05	Show Abstract	PDF
31	2403.02966v1	Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering	Sungho Ko	2024-03-05	Show Abstract	PDF
32	2403.00884v2	Text classification of column headers with a controlled vocabulary: leveraging LLMs for metadata enrichment	Margherita Martorana	2024-03-01	Show Abstract	PDF
33	2403.02965v1	ChatGPT and biometrics: an assessment of face recognition, gender detection, and age estimation capabilities	Ahmad Hassanpour	2024-03-05	Show Abstract	PDF
34	2403.02962v1	WikiTableEdit: A Benchmark for Table Editing by Natural Language Instruction	Zheng Li	2024-03-05	Show Abstract	PDF
35	2403.02959v1	SimuCourt: Building Judicial Decision-Making Agents with Real-world Judgement Documents	Zhitao He	2024-03-05	Show Abstract	PDF	Link
36	2312.09979v3	LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin	Shihan Dou	2023-12-15	Show Abstract	PDF
37	2403.02951v1	Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation	Bin Zhang	2024-03-05	Show Abstract	PDF
38	2312.16044v4	LLMLight: Large Language Models as Traffic Signal Control Agents	Siqi Lai	2023-12-26	Show Abstract	PDF	Link
39	2403.02939v1	PaperWeaver: Enriching Topical Paper Alerts by Contextualizing Recommended Papers with User-collected Papers	Yoonjoo Lee	2024-03-05	Show Abstract	PDF
40	2402.18180v3	Human Simulacra: A Step toward the Personification of Large Language Models	Qiuejie Xie	2024-02-28	Show Abstract	PDF
41	2308.06354v2	Large Language Models to Identify Social Determinants of Health in Electronic Health Records	Marco Guevara	2023-08-11	Show Abstract	PDF	Link
42	2403.02910v1	ImgTrojan: Jailbreaking Vision-Language Models with ONE Image	Xijia Tao	2024-03-05	Show Abstract	PDF
43	2402.18240v2	Prospect Personalized Recommendation on Large Language Model-based Agent Platform	Jizhi Zhang	2024-02-28	Show Abstract	PDF	Link
44	2403.02901v1	A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods	Hanlei Jin	2024-03-05	Show Abstract	PDF
45	2401.06509v3	AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents	Yuanzhi Liang	2024-01-12	Show Abstract	PDF
46	2403.02889v1	In Search of Truth: An Interrogation Approach to Hallucination Detection	Yakir Yehuda	2024-03-05	Show Abstract	PDF
47	2403.02884v1	MathScale: Scaling Instruction Tuning for Mathematical Reasoning	Zhengyang Tang	2024-03-05	Show Abstract	PDF

Index	Arxiv ID	Title	First Author	Submit Date	Abstract	PDF Links	Github Code	Paper With Code
0	2403.03203v1	CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments	Savitha Sam Abraham	2024-03-05	Show Abstract	PDF
1	2403.03134v1	Simplicity in Complexity	Kevin Shen	2024-03-05	Show Abstract	PDF
2	2403.02164v2	Cognition is All You Need -- The Next Layer of AI Above Large Language Models	Nova Spivack	2024-03-04	Show Abstract	PDF
3	2403.02752v1	HINTs: Sensemaking on large collections of documents with Hypergraph visualization and INTelligent agents	Sam Yu-Te Lee	2024-03-05	Show Abstract	PDF
4	2403.02571v1	DPAdapter: Improving Differentially Private Deep Learning through Noise Tolerance Pre-training	Zihao Wang	2024-03-05	Show Abstract	PDF
5	2403.02522v1	HeAR -- Health Acoustic Representations	Sebastien Baur	2024-03-04	Show Abstract	PDF
6	2403.01699v1	Brilla AI: AI Contestant for the National Science and Maths Quiz	George Boateng	2024-03-04	Show Abstract	PDF
7	2403.01626v1	Using LLMs for Tabletop Exercises within the Security Domain	Sam Hays	2024-03-03	Show Abstract	PDF
8	2403.01476v1	CCC: Color Classified Colorization	Mrityunjoy Gain	2024-03-03	Show Abstract	PDF
9	2403.01418v1	A Simple-but-effective Baseline for Training-free Class-Agnostic Counting	Yuhao Lin	2024-03-03	Show Abstract	PDF
10	2311.05112v4	A Survey of Large Language Models in Medicine: Progress, Application, and Challenge	Hongjian Zhou	2023-11-09	Show Abstract	PDF		Link
11	2403.01323v1	A non-cubic space-filling modular robot	Tyler Hummer	2024-03-02	Show Abstract	PDF
12	2403.01271v1	Employing LLMs for Incident Response Planning and Review	Sam Hays	2024-03-02	Show Abstract	PDF
13	2401.05638v2	MatSAM: Efficient Extraction of Microstructures of Materials via Visual Large Model	Changtai Li	2024-01-11	Show Abstract	PDF		Link
14	2402.16338v3	BLO-SAM: Bi-level Optimization Based Overfitting-Preventing Finetuning of SAM	Li Zhang	2024-02-26	Show Abstract	PDF
15	2310.05670v2	Reinforcement learning for freeform robot design	Muhan Li	2023-10-09	Show Abstract	PDF
16	2303.08774v5	GPT-4 Technical Report	OpenAI	2023-03-15	Show Abstract	PDF		Link
17	2303.18242v2	∞-Diff: Infinite Resolution Diffusion with Subsampled Mollified States	Sam Bond-Taylor	2023-03-31	Show Abstract	PDF		Link
18	2403.00574v1	Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms	Toki Tahmid Inan	2024-03-01	Show Abstract	PDF
19	2403.00334v1	NOVA: A visual interface for assessing polarizing media coverage	Keshav Dasu	2024-03-01	Show Abstract	PDF
20	2402.04140v3	Advancing Legal Reasoning: The Integration of AI to Navigate Complexities and Biases in Global Jurisprudence with Semi-Automated Arbitration Processes (SAAPs)	Michael De'Shazer	2024-02-06	Show Abstract	PDF
21	2402.19145v1	A SAM-guided Two-stream Lightweight Model for Anomaly Detection	Chenghao Li	2024-02-29	Show Abstract	PDF		Link
22	2402.19102v1	FlatNAS: optimizing Flatness in Neural Architecture Search for Out-of-Distribution Robustness	Matteo Gambella	2024-02-29	Show Abstract	PDF
23	2402.19004v1	RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation	Jie Zhang	2024-02-29	Show Abstract	PDF
24	2312.05760v2	RepViT-SAM: Towards Real-Time Segmenting Anything	Ao Wang	2023-12-10	Show Abstract	PDF	Link	Link
25	2307.09283v7	RepViT: Revisiting Mobile CNN From ViT Perspective	Ao Wang	2023-07-18	Show Abstract	PDF	Link	Link
26	2402.18728v1	Not All the Same: Understanding and Informing Similarity Estimation in Tile-Based Video Games	Sebastian Berns	2024-02-28	Show Abstract	PDF	Link	Link
27	2402.18659v1	Large Language Models and Games: A Survey and Roadmap	Roberto Gallotta	2024-02-28	Show Abstract	PDF
28	2402.18204v1	ConvDTW-ACS: Audio Segmentation for Track Type Detection During Car Manufacturing	Álvaro López-Chilet	2024-02-28	Show Abstract	PDF
29	2309.00655v4	RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion	Zhiqiang Yan	2023-09-01	Show Abstract	PDF
30	2310.10010v2	Black-box Targeted Adversarial Attack on Segment Anything (SAM)	Sheng Zheng	2023-10-16	Show Abstract	PDF
31	2402.17972v1	From Generalization to Precision: Exploring SAM for Tool Segmentation in Surgical Environments	Kanyifeechukwu J. Oguine	2024-02-28	Show Abstract	PDF
32	2311.02189v3	FairSeg: A Large-Scale Medical Image Segmentation Dataset for Fairness Learning Using Segment Anything Model with Fair Error-Bound Scaling	Yu Tian	2023-11-03	Show Abstract	PDF		Link

Index	Arxiv ID	Title	First Author	Submit Date	Abstract	PDF Links	Paper With Code
0	2403.03203v1	CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments	Savitha Sam Abraham	2024-03-05	Show Abstract	PDF
1	2403.03145v1	Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization	Yuxin Guo	2024-03-05	Show Abstract	PDF	Link
2	2403.03134v1	Simplicity in Complexity	Kevin Shen	2024-03-05	Show Abstract	PDF
3	2305.14342v4	Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training	Hong Liu	2023-05-23	Show Abstract	PDF	Link
4	2309.15065v2	Language-EXtended Indoor SLAM (LEXIS): A Versatile System for Real-time Visual Scene Understanding	Christina Kassab	2023-09-26	Show Abstract	PDF
5	2403.02991v1	MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer	Jianjian Cao	2024-03-05	Show Abstract	PDF	Link
6	2403.02781v1	PromptKD: Unsupervised Prompt Distillation for Vision-Language Models	Zheng Li	2024-03-05	Show Abstract	PDF	Link
7	2403.02714v1	DomainVerse: A Benchmark Towards Real-World Distribution Shifts For Tuning-Free Adaptive Domain Generalization	Feng Hou	2024-03-05	Show Abstract	PDF
8	2403.02677v1	Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters	Weizhi Wang	2024-03-05	Show Abstract	PDF
9	2403.02626v1	Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use	Imad Eddine Toubal	2024-03-05	Show Abstract	PDF
10	2403.02580v1	What do we learn from inverting CLIP models?	Hamid Kazemi	2024-03-05	Show Abstract	PDF	Link
11	2403.02522v1	HeAR -- Health Acoustic Representations	Sebastien Baur	2024-03-04	Show Abstract	PDF
12	2311.12075v3	BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning	Siyuan Liang	2023-11-20	Show Abstract	PDF
13	2403.02041v1	A Generative Approach for Wikipedia-Scale Visual Entity Recognition	Mathilde Caron	2024-03-04	Show Abstract	PDF
14	2310.06836v2	What Does Stable Diffusion Know about the 3D Scene?	Guanqi Zhan	2023-10-10	Show Abstract	PDF	Link
15	2403.01849v1	One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models	Lin Li	2024-03-04	Show Abstract	PDF	Link
16	2403.01840v1	FreeA: Human-object Interaction Detection using Free Annotation Labels	Yuxiao Wang	2024-03-04	Show Abstract	PDF
17	2403.01560v1	Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition	Kun-Yu Lin	2024-03-03	Show Abstract	PDF	Link
18	2312.05588v2	Language-assisted Vision Model Debugger: A Sample-Free Approach to Finding and Fixing Bugs	Chaoquan Jiang	2023-12-09	Show Abstract	PDF
19	2403.01422v1	MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies	Zhende Song	2024-03-03	Show Abstract	PDF
20	2311.00453v2	CLIP-AD: A Language-Guided Staged Dual-Path Model for Zero-shot Anomaly Detection	Xuhai Chen	2023-11-01	Show Abstract	PDF
21	2403.01209v1	Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning	Shuo Yang	2024-03-02	Show Abstract	PDF
22	2308.15109v2	DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection	Henghao Zhao	2023-08-29	Show Abstract	PDF
23	2306.12105v2	Mass-Producing Failures of Multimodal Systems with Language Models	Shengbang Tong	2023-06-21	Show Abstract	PDF	Link
24	2403.00939v1	G3DR: Generative 3D Reconstruction in ImageNet	Pradyumna Reddy	2024-03-01	Show Abstract	PDF
25	2403.00436v1	Abductive Ego-View Accident Video Understanding for Safe Driving Perception	Jianwu Fang	2024-03-01	Show Abstract	PDF
26	2403.00376v1	Invariant Test-Time Adaptation for Vision-Language Model Generalization	Huan Ma	2024-03-01	Show Abstract	PDF
27	2306.08173v2	Safeguarding Data in Multimodal AI: A Differentially Private Approach to CLIP Training	Alyssa Huang	2023-06-13	Show Abstract	PDF	Link
28	2402.19467v2	TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning	Kate Sanders	2024-02-29	Show Abstract	PDF
29	2402.15021v2	CLoVe: Encoding Compositional Language in Contrastive Vision-Language Models	Santiago Castro	2024-02-22	Show Abstract	PDF	Link
30	2403.00219v1	Multi-modal Attribute Prompting for Vision-Language Models	Xin Liu	2024-03-01	Show Abstract	PDF
31	2402.19479v1	Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers	Tsai-Shien Chen	2024-02-29	Show Abstract	PDF
32	2309.16782v2	Surgical Tattoos in Infrared: A Dataset for Quantifying Tissue Tracking and Mapping	Adam Schmidt	2023-09-28	Show Abstract	PDF
33	2403.00853v1	Distributed Momentum Methods Under Biased Gradient Estimations	Ali Beikmohammadi	2024-02-29	Show Abstract	PDF
34	2402.19150v1	Typographic Attacks in Large Multimodal Models Can be Alleviated by More Informative Prompts	Hao Cheng	2024-02-29	Show Abstract	PDF
35	2402.19091v1	Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection	Christos Koutlis	2024-02-29	Show Abstract	PDF
36	2401.04350v2	Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness	Sibo Wang	2024-01-09	Show Abstract	PDF
37	2402.18490v1	TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding	Zhihao Zhang	2024-02-28	Show Abstract	PDF
38	2402.18400v1	Balanced Similarity with Auxiliary Prompts: Towards Alleviating Text-to-Image Retrieval Bias for CLIP in Zero-shot Learning	Hanyao Wang	2024-02-28	Show Abstract	PDF
39	2310.01018v2	Controlling Vision-Language Models for Multi-Task Image Restoration	Ziwei Luo	2023-10-02	Show Abstract	PDF	Link
40	2402.13250v3	Video ReCap: Recursive Captioning of Hour-Long Videos	Md Mohaiminul Islam	2024-02-20	Show Abstract	PDF
41	2402.17412v2	DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models	Shyam Marjit	2024-02-27	Show Abstract	PDF
42	2402.17930v1	Pragmatic Instruction Following and Goal Assistance via Cooperative Language-Guided Inverse Planning	Tan Zhi-Xuan	2024-02-27	Show Abstract	PDF	Link
43	2402.17535v1	Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control	Thong Nguyen	2024-02-27	Show Abstract	PDF	Link
44	2402.17205v1	Measuring Vision-Language STEM Skills of Neural Models	Jianhao Shen	2024-02-27	Show Abstract	PDF

Index	Arxiv ID	Title	First Author	Submit Date	Abstract	PDF Links	Paper With Code
0	2310.06836v2	What Does Stable Diffusion Know about the 3D Scene?	Guanqi Zhan	2023-10-10	Show Abstract	PDF	Link
1	2403.00459v2	Deformable One-shot Face Stylization via DINO Semantic Guidance	Yang Zhou	2024-03-01	Show Abstract	PDF	Link
2	2309.10726v3	Few-Shot Panoptic Segmentation With Foundation Models	Markus Käppeler	2023-09-19	Show Abstract	PDF	Link
3	2402.18362v1	Objective and Interpretable Breast Cosmesis Evaluation with Attention Guided Denoising Diffusion Anomaly Detection Model	Sangjoon Park	2024-02-28	Show Abstract	PDF
4	2402.15687v1	General Purpose Image Encoder DINOv2 for Medical Image Registration	Xinrui Song	2024-02-24	Show Abstract	PDF
5	2402.14976v1	Unsupervised Domain Adaptation within Deep Foundation Latent Spaces	Dmitry Kangin	2024-02-22	Show Abstract	PDF
6	2402.14957v1	The Common Stability Mechanism behind most Self-Supervised Learning Approaches	Abhishek Jha	2024-02-22	Show Abstract	PDF	Link
7	2402.14566v1	Self-supervised Visualisation of Medical Image Datasets	Ifeoma Veronica Nwabufo	2024-02-22	Show Abstract	PDF	Link
8	2402.13181v1	DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models	Norman Di Palo	2024-02-20	Show Abstract	PDF
9	2402.10513v2	Understanding Delays in AF_XDP-based Applications	Killian Castillon du Perron	2024-02-16	Show Abstract	PDF
10	2402.10793v1	Masked Attention is All You Need for Graphs	David Buterez	2024-02-16	Show Abstract	PDF
11	2402.10717v1	BioFusionNet: Deep Learning-Based Survival Risk Stratification in ER+ Breast Cancer Through Multifeature and Multimodal Data Fusion	Raktim Kumar Mondol	2024-02-16	Show Abstract	PDF
12	2311.18237v2	Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models	Raviteja Vemulapalli	2023-11-30	Show Abstract	PDF
13	2402.09608v1	Exact, Fast and Expressive Poisson Point Processes via Squared Neural Families	Russell Tsuchida	2024-02-14	Show Abstract	PDF	Link
14	2402.06287v1	AI, Meet Human: Learning Paradigms for Hybrid Decision Making Systems	Clara Punzi	2024-02-09	Show Abstract	PDF
15	2402.03138v1	Just Cluster It: An Approach for Exploration in High-Dimensions using Clustering and Pre-Trained Representations	Stefan Sylvius Wagner	2024-02-05	Show Abstract	PDF
16	2402.02851v1	Enhancing Compositional Generalization via Compositional Feature Alignment	Haoxiang Wang	2024-02-05	Show Abstract	PDF	Link
17	2402.02352v1	Region-Based Representations Revisited	Michal Shlapentokh-Rothman	2024-02-04	Show Abstract	PDF
18	2304.07193v2	DINOv2: Learning Robust Visual Features without Supervision	Maxime Oquab	2023-04-14	Show Abstract	PDF	Link
19	2401.17981v1	Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study	Qirui Jiao	2024-01-31	Show Abstract	PDF
20	2401.17632v1	What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis	Takanori Ashihara	2024-01-31	Show Abstract	PDF
21	2310.08873v2	Interactive Navigation in Environments with Traversable Obstacles Using Large Language and Vision-Language Models	Zhen Zhang	2023-10-13	Show Abstract	PDF
22	2401.05925v3	CoSSegGaussians: Compact and Swift Scene Segmenting 3D Gaussians with Dual Feature Fusion	Bin Dou	2024-01-11	Show Abstract	PDF
23	2401.14555v1	Revisiting Active Learning in the Era of Vision Foundation Models	Sanket Rajan Gupte	2024-01-25	Show Abstract	PDF	Link
24	2401.14159v1	Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks	Tianhe Ren	2024-01-25	Show Abstract	PDF	Link
25	2401.13987v1	Cross-Domain Few-Shot Learning via Adaptive Transformer Networks	Naeem Paeedeh	2024-01-25	Show Abstract	PDF	Link
26	2401.11673v1	MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo	Chenjie Cao	2024-01-22	Show Abstract	PDF	Link
27	2401.11311v1	A Novel Benchmark for Few-Shot Semantic Segmentation in the Era of Foundation Models	Reda Bensaid	2024-01-20	Show Abstract	PDF
28	2401.10815v1	RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision	Fernando Pérez-García	2024-01-19	Show Abstract	PDF
29	2401.07951v1	Image Similarity using An Ensemble of Context-Sensitive Models	Zukang Liao	2024-01-15	Show Abstract	PDF
30	2401.06013v2	Surgical-DINO: Adapter Learning of Foundation Models for Depth Estimation in Endoscopic Surgery	Beilei Cui	2024-01-11	Show Abstract	PDF	Link
31	2305.14093v4	Weakly Supervised 3D Open-vocabulary Segmentation	Kunhao Liu	2023-05-23	Show Abstract	PDF	Link
32	2401.02957v1	Denoising Vision Transformers	Jiawei Yang	2024-01-05	Show Abstract	PDF
33	2401.02361v2	An Open and Comprehensive Pipeline for Unified Object Grounding and Detection	Xiangyu Zhao	2024-01-04	Show Abstract	PDF	Link
34	2211.12735v2	Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration	Yunjie Tian	2022-11-23	Show Abstract	PDF	Link

Arxiv Daily - 2024-03-06

Volumetric Semantically Consistent 3D Panoptic Mapping

LiSTA: Geometric Object-Based Change Detection in Cluttered Environments

Language-EXtended Indoor SLAM (LEXIS): A Versatile System for Real-time Visual Scene Understanding

Tightly-Coupled LiDAR-Visual-Inertial SLAM and Large-Scale Volumetric Occupancy Mapping

Structure from WiFi (SfW): RSSI-based Geometric Mapping of Indoor Environments

Collaborative Dynamic 3D Scene Graphs for Automated Driving

Dive Deeper into Rectifying Homography for Stereo Camera Online Self-Calibration

SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM

Grid-based Fast and Structural Visual Odometry

BotanicGarden: A High-Quality Dataset for Robot Navigation in Unstructured Natural Environments

Joint Spatial-Temporal Calibration for Camera and Global Pose Sensor

High-Speed Detector For Low-Powered Devices In Aerial Grasping

VPRTempo: A Fast Temporally Encoded Spiking Neural Network for Visual Place Recognition

DISORF: A Distributed Online NeRF Training and Rendering Framework for Mobile Robots

Benchmarking ground truth trajectories with robotic total stations

NARUTO: Neural Active Reconstruction from Uncertain Target Observations

SD-SLAM: A Semantic SLAM Approach for Dynamic Scenes Based on LiDAR Point Clouds

Generation of skill-specific maps from graph world models for robotic systems

MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction

VOOM: Robust Visual Object Odometry and Mapping using Hierarchical Landmarks

Modeling Point Uncertainty in Radar SLAM

An Error-Matching Exclusion Method for Accelerating Visual SLAM

A Feature Matching Method Based on Multi-Level Refinement Strategy

VOLoc: Visual Place Recognition by Querying Compressed Lidar Map

CoLRIO: LiDAR-Ranging-Inertial Centralized State Estimation for Robotic Swarms

Ground-Fusion: A Low-cost Ground SLAM System Robust to Corner Cases

Secure Navigation using Landmark-based Localization in a GPS-denied Environment

Khronos: A Unified Approach for Spatio-Temporal Metric-Semantic SLAM in Dynamic Environments

EffLoc: Lightweight Vision Transformer for Efficient 6-DOF Camera Relocalization

How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey

Particle Filter SLAM for Vehicle Localization

Landmark-based Localization using Stereo Vision and Deep Learning in GPS-Denied Battlefield Environment

MLFEF: Machine Learning Fusion Model with Empirical Formula to Explore the Momentum in Competitive Sports

3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods

Loopy-SLAM: Dense Neural SLAM with Loop Closures

RB5 Low-Cost Explorer: Implementing Autonomous Long-Term Exploration on Low-Cost Robotic Hardware

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity

3D Active Metric-Semantic SLAM

Customizable Perturbation Synthesis for Robust SLAM Benchmarking

UAV-assisted Visual SLAM Generating Reconstructed 3D Scene Graphs in GPS-denied Environments

Semantic Object-level Modeling for Robust Visual Camera Relocalization

SLAM for Visually Impaired People: a Survey

PAS-SLAM: A Visual SLAM System for Planar Ambiguous Scenes

Preferential Multi-Target Search in Indoor Environments using Semantic SLAM

Online and Certifiably Correct Visual Odometry and Mapping

Efficient Invariant Kalman Filter for Inertial-based Odometry with Large-sample Environmental Measurements

Adaptive Denoising-Enhanced LiDAR Odometry for Degeneration Resilience in Diverse Terrains

A Robust Error-Resistant View Selection Method for 3D Reconstruction

GaussianPro: 3D Gaussian Splatting with Progressive Propagation

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

Dense Matchers for Dense Tracking

PixTrack: Precise 6DoF Object Pose Tracking using NeRF Templates and Feature-metric Alignment

On-the-Fly SfM: What you capture is What you get

LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS

Enlighten-Your-Voice: When Multimodal Meets Zero-shot Low-light Image Enhancement

Local Feature Matching Using Deep Learning: A Survey

Fusing Structure from Motion and Simulation-Augmented Pose Regression from Optical Flow for Challenging Indoor Environments

PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment

Speech foundation models on intelligibility prediction for hearing-impaired listeners

HG3-NeRF: Hierarchical Geometric, Semantic, and Photometric Guided Neural Radiance Fields for Sparse View Inputs

SCENES: Subpixel Correspondence Estimation With Epipolar Supervision

3D Scene Geometry Estimation from 360° Imagery: A Survey

ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization

Cross-Modal Semi-Dense 6-DoF Tracking of an Event Camera in Challenging Conditions

A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles

Photometric Correction for Infrared Sensors

Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects

A Classification of Critical Configurations for any Number of Projective Views

Research on Multilingual Natural Scene Text Detection Algorithm

Yes, we CANN: Constrained Approximate Nearest Neighbors for local feature-based visual localization

Pose-Free Generalizable Rendering Transformer

Residual Learning for Image Point Descriptors

NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views

Transformers in Unsupervised Structure-from-Motion

HeadRecon: High-Fidelity 3D Head Reconstruction from Monocular Video

CF-NeRF: Camera Parameter Free Neural Radiance Fields with Incremental Learning

COLMAP-Free 3D Gaussian Splatting

Keypoint-based Stereophotoclinometry for Characterizing and Navigating Small Bodies: A Factor Graph Approach

Gaussian Splatting SLAM