Kapsch Group

Master’s Thesis: Monocular 3D Object Detection

Posted: 9 hours ago

Job Description

Introduction: Conventional single-shot object detection neural networks, such as YOLO, have achieved remarkable success in identifying and localizing objects within 2D images using axis-aligned rectangular bounding boxes. While effective for many applications, these 2D representations lack crucial information about the object's true 3D pose, dimensions, and orientation in the real world. This limitation becomes significant in applications requiring a deeper understanding of the scene, such as autonomous driving, robotics, and augmented reality.

The goal of this thesis is to extend the capabilities of single-shot object detection networks by developing, training, and evaluating a model that directly predicts oriented 3D bounding boxes from a single monocular image. This includes estimating the object's 3D location, its dimensions (length, width, height), and its 3D orientation.

Motivation: Accurate and efficient monocular 3D object detection is a crucial task in various computer vision applications. Relying on a single camera offers advantages in terms of cost, simplicity, and ease of deployment compared to multi-camera or LiDAR-based systems. This thesis aims to contribute to the advancement of monocular 3D object detection by exploring and implementing a single-shot approach capable of predicting oriented 3D bounding boxes.

Tasks

Literature Review on Monocular 3D Object Detection CNNs:

  • Conduct a comprehensive review of existing research in monocular 3D object detection using Convolutional Neural Networks (CNNs).
  • Address the issue of camera calibration: Thoroughly examine how different methods handle camera calibration parameters (intrinsic and extrinsic) and their impact on the accuracy of 3D object detection.
  • Analyze the strengths and weaknesses of different network architectures.

Investigation of System Constraints:

  • Identify and analyze the inherent challenges and constraints of monocular 3D object detection compared to methods utilizing depth information.
  • Consider factors such as:
      • Scale ambiguity: The difficulty in determining the absolute size and distance of an object from a single 2D image.
      • Occlusion: How occluded objects can affect the accuracy of 3D bounding box prediction.
      • Viewpoint variation: The impact of different viewing angles on the perceived shape and size of objects.
      • Computational resources: The computational complexity and real-time requirements of potential applications.

Design and Development of an Oriented 3D Bounding Box CNN:

  • Based on the literature review and the identified system constraints, design a novel single-shot object detection CNN architecture, or adapt an existing one, to predict oriented 3D bounding boxes.
  • This will involve:
      • Choosing an appropriate backbone network.
      • Designing the output layers to predict the parameters of the 3D bounding box (e.g., center coordinates, dimensions, Euler angles or quaternions for orientation).
      • Defining a suitable loss function that incorporates the different aspects of 3D bounding box prediction.

Training and Evaluation on Real-World Data:

  • Select one or more suitable real-world datasets with 3D object annotations (both Kapsch proprietary and public).
  • Implement and train the model.
  • Evaluate the performance of the trained model using appropriate 3D object detection metrics (e.g., Average Precision with different IoU thresholds in 3D space).
  • Analyze the results, identify limitations, and discuss potential future improvements.

Expected Deliverables

  • A comprehensive literature review on monocular 3D object detection CNNs.
  • A detailed description of the designed and implemented model architecture.
  • A thorough evaluation of the model's performance on real-world data.
  • A written thesis document summarizing the research process, findings, and conclusions.
  • Potentially, a working implementation of the developed model.

This thesis provides an excellent opportunity to delve into the challenging and rapidly evolving field of monocular 3D object detection. The student will gain practical experience in literature review, deep learning model design, implementation, training, and evaluation on real-world data.

Your Profile

Required Background

  • Studies in Computer Science, Software Engineering, Information Technology, Geoinformatics or related fields
  • Fluent English skills
  • Interest in technology
  • Willingness and ability to work independently
  • Excellent communication and teamwork skills
  • Conscientiousness and reliability
  • Strong analytical skills with a precise and structured approach

Start: Immediately
Duration: 3–6 months

Successful completion of the master’s thesis will be rewarded with €3,000.

Contact

Edwin Frühwirth
Edwin.Fruehwirth@kapsch.net

