Principal Investigator: Dr Lama Tarsissi
The research project "Combinatorics on Words + Digital Geometry = Data Compression" (CoW+DG=DC) explores the synergy between combinatorics on words and digital geometry to develop new data compression methods. It targets the challenges of managing vast, heterogeneous datasets (such as genomic sequences, social graphs, and 3D point clouds) and draws on the theory of both fields to optimize storage, indexing, and pattern detection.
The study is structured around four axes: (1) analyzing two-dimensional patterns in digital planes to uncover combinatorial invariants; (2) developing algorithms for generating and recognizing digital planes using multidimensional continued fractions, with the aim of improving 3D data processing; (3) exploring practical applications in fields such as bioinformatics, cybersecurity, and text compression; and (4) fostering international collaboration through a workshop at Sorbonne University Abu Dhabi. Key objectives include improving the compression of symbolic sequences, detecting regularities in data streams, and advancing 3D reconstruction techniques for noisy datasets, such as those from LiDAR or medical imaging. By bridging abstract theory and practical outcomes, the project aims to contribute to data science, artificial intelligence, and large-scale information management, in line with the evolving needs of language models and modern data paradigms.
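
To make the objects of axes (1) and (2) concrete, here is a minimal sketch under stated assumptions. It uses a standard definition from digital geometry, the naive arithmetic plane: the integer points (x, y, z) satisfying 0 <= a*x + b*y + c*z + mu < c, assuming integer coefficients with 0 <= a, b <= c and c > 0. The function names and the brute-force approach are illustrative choices, not the project's algorithms, and no continued fractions are involved.

```python
def in_naive_plane(a, b, c, mu, x, y, z):
    """Membership test: 0 <= a*x + b*y + c*z + mu < c (assumes 0 <= a, b <= c, c > 0)."""
    return 0 <= a * x + b * y + c * z + mu < c


def naive_plane_heights(a, b, c, mu, width, height):
    """Return rows (indexed by y) of the unique z above each (x, y) in the grid.

    Writing r = a*x + b*y + mu, the choice z = -(r // c) leaves the
    remainder r mod c, which always lies in [0, c).
    """
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            r = a * x + b * y + mu
            row.append(-(r // c))  # Python's // floors, also for negative r
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical example normal vector (1, 2, 3) with mu = 0.
    for row in naive_plane_heights(1, 2, 3, 0, 6, 4):
        print(row)
```

The resulting height map is a two-dimensional array of integers, and the patterns of steps between neighbouring heights are the kind of two-dimensional words whose regularities axis (1) sets out to study.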