Research

BioDataScience Lab

Professor Ivica Šamanić and I formed the BioDataScience Lab, whose main purpose is to establish a pipeline for biological data processing, from de novo genome assembly to functional prediction of metagenomic samples, in order to lay the foundations for subsequent use of AI models on biological data.

In July 2024 we held a workshop for colleagues and students, and although the workshop was planned to be in Croatian only, we had international participants.

You can see snippets of materials (in Croatian) from the workshop here:

The plasmid data we used at the workshop were generated by whole genome sequencing (WGS) technology and came from the ARGAS project, a Croatian Science Foundation project led by Professor Ana Maravić.

Large language models

For demonstration purposes we run several generative models on our server. One of the first models was built to help colleagues correct grammar and rewrite (or generate) text. As our team expanded, we started using generative modeling for inpainting, with the intention of building upon existing research on imputed satellite data reconstruction. Due to limited resources, these models were made available for testing only to IPs within the University and were password restricted. If you are interested in testing the capabilities of our model or seeing a demo, please contact me.

University of Basel visit

In March 2023 I visited the laboratory of Professor Ivan Dokmanić at the University of Basel.

ERC-2023-CoG grant application

On February 2nd 2023 I submitted my proposal to the HORIZON-ERC call under the topic ERC-2023-COG. The title of my proposal is "StrateGEM - Strategic Game Enforcing Mechanisms", and here is a short summary of it:

With the rise of autonomous machines our reality becomes a complex structure interwoven with interactions among multiple intelligent agents. The current pace of development suggests that the number of machine-based intelligent agents will significantly outnumber humans, making direct communication between humans statistically unlikely. This could have a significant disruptive effect on society and its structure. In the history of mankind, we have seen several times how (hierarchical) structures have emerged almost by themselves due to agglomeration effects, physical limitations, limited resources or efficiency requirements. The last two examples are the internet and cryptocurrencies. The StrateGEM project raises the question of whether and under what circumstances and assumptions it is possible to enforce the direction of the "game" played by multiple agents in order to control this game. The goal is to identify game-enforcing mechanisms (GEMs), measure their scope and impact, and perhaps even use them as a fulcrum to achieve strategic dominance. To do this, we need to examine the internal structure of intelligent agents, the communication between them and the mechanisms that drive the evolution of the game. Since GEMs can be used to steer development in one direction or another, we are also interested in whether it is possible to develop a strategy that preserves certain values within the structure of the game, such as democracy, and what price each agent has to pay for this.

Environmental data analysis laboratory

Data can come from a variety of sources: natural environment, virtual environment, social networks, (tele)communication networks, IoT, P2P networks (such as Blockchain), etc. Regardless of the data source, things such as veracity, accountability, reliability and security are becoming increasingly important. The evolving machine world grows around us and more and more problems are being automated and many tasks are being solved faster. Machine-human interaction is becoming our daily reality, and due to the pace of change, more and more decisions are being delegated to autonomous machines. In our lab, we are preparing for this reality.

CURRENT RESEARCH & COLLABORATIONS:

- Satellite data, collaboration: PML via Simons Foundation

  • + We aim to transfer this knowledge to communication and P2P networks (e.g. small ARM devices).

- Traffic data, collaboration: UniZg, HEVS, TCD

  • + Digital twin & augmented data for critical but rare events.

- Genome analysis, collaboration: Biology department

Software sensor augmentation

Software sensors can be considered virtual sensors that identify the state of a system or measure latent (unobservable) variables to gain insight into that state (as in XAI). They can also be used as estimators of real variables, offering advantages such as reducing the number of physical sensors or allowing less expensive sensors to be used as proxies for information that would normally come from more expensive devices or measurement endeavours. In ocean engineering, an example could be a UV sensor used as a proxy to estimate solar radiation, or a fusion of multiple sensors to estimate fish abundance, or the use of fibre-optic underwater cables to monitor seismic activity. Other approaches focus on resource optimization, such as placing sensors to achieve optimal coverage, or selecting the optimal number of sensors to accurately monitor a predefined area. Each of these applications augments the capabilities of the sensor in terms of its use or coverage. The idea of a software sensor and the augmentation of a sensor by a machine learning model can be extended to many different scenarios commonly associated with IoT, e.g. optimising the energy consumption of sensors and estimating the veracity of (software) sensors. In addition to work that relates the topological architecture of a sensor network to the quality of the data acquired for a given domain, we are interested in applications that use deep neural networks or transfer learning (e.g. for satellite or multispectral data) or deal with signal reconstruction from sparse data (as in compressed sensing or one-shot learning).

UPDATE: We have published two papers in Q1 journals on this topic. In them we have shown that only a relatively small portion of the area needs to be covered in order to estimate the wind over the Adriatic or the Mediterranean Sea with relatively simple machine learning models. Thus, with a small number of hardware sensors, a much larger area can be covered using software sensor augmentation. The wind data used in this study could prove useful when planning for wind energy harvesting. In general, the same data (albeit of lower quality) can be obtained in this way with fewer measurements.
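
The idea of covering a large area from a few measurement points can be illustrated with a small sketch. This is not the method from the published papers: the data below are synthetic (a few smooth spatial modes plus noise standing in for a wind field), ridge regression is chosen only as an example of a "relatively simple" model, and all function names and parameters (make_synthetic_fields, fit_software_sensor, reconstruct, the sensor positions) are hypothetical.

```python
import numpy as np


def make_synthetic_fields(n_samples, n_points, n_modes=3, noise=0.01, seed=0):
    """Synthetic 1-D 'fields' built from a few smooth spatial modes.

    This is an assumption standing in for real wind data.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    # Each row is one smooth spatial pattern, shape (n_modes, n_points).
    modes = np.stack([np.sin((k + 1) * np.pi * x) for k in range(n_modes)])
    amplitudes = rng.normal(size=(n_samples, n_modes))
    return amplitudes @ modes + noise * rng.normal(size=(n_samples, n_points))


def fit_software_sensor(fields, sensor_idx, ridge=1e-3):
    """Fit a ridge regression from readings at sensor_idx to the full field."""
    X = fields[:, sensor_idx]                       # (n_samples, n_sensors)
    A = X.T @ X + ridge * np.eye(len(sensor_idx))   # regularised normal equations
    return np.linalg.solve(A, X.T @ fields)         # (n_sensors, n_points)


def reconstruct(readings, W):
    """Estimate the full field from sensor readings only."""
    return readings @ W


fields = make_synthetic_fields(n_samples=400, n_points=200)
train, test = fields[:300], fields[300:]

# Hypothetical placement: 5 hardware sensors out of 200 grid points.
sensor_idx = np.array([20, 60, 100, 140, 180])

W = fit_software_sensor(train, sensor_idx)
estimate = reconstruct(test[:, sensor_idx], W)

rel_error = np.linalg.norm(estimate - test) / np.linalg.norm(test)
coverage = len(sensor_idx) / test.shape[1]
print(f"sensors cover {coverage:.1%} of points, relative error {rel_error:.3f}")
```

The sketch works only because the synthetic field has a low-dimensional structure that a handful of points can pin down; how well this carries over to real data depends on the field and on where the sensors are placed.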

StVar-Adri

The project "Strength and Variability of the Adriatic sea level extremes in present and future climate" received funding from the Croatian Science Foundation in December 2020. I am a senior researcher on that project. The link to the project webpage can be found here.

SHExtreme

The project funded by the European Research Council, entitled "Estimating contribution of sub-hourly sea level oscillations to overall sea level extremes in changing climate", started in October 2020. I am a senior researcher on that project. The link to the project webpage can be found here.

SSA@EDAL

The project entitled "Software Sensor Development at Environmental Data Analysis Laboratory" (SSA@EDAL) received funding from the Croatian Science Foundation in February 2020. I am the principal investigator and project leader. The link to the project webpage can be found here.

AIMS & GOALS: (a) missing data reconstruction, (b) reducing the number of measuring stations or the measurement frequency (which may lower the cost), (c) development of a new type of sensor, (d) faster initialization or adaptation

Metagenome

Recently, in collaboration with colleagues from the Department of Biology, I started working on metagenome data analysis.

UPDATE: Part of this research has been merged into the Croatian Science Foundation project "Seasonal and spatial distribution of antibiotic resistance genes in marine microbial communities along a trophic gradient in central Adriatic Sea (ARGAS)". ARGAS started in December 2019.

Machine Learning Group

The group focuses on modern machine learning techniques, with special attention to deep learning and deep neural networks. Apart from supervised and unsupervised frameworks for learning, we investigate transfer learning, semi-supervised learning, multitask learning and reinforcement learning. Applications range from economy and medicine to traffic and physics or language and multimedia. We are especially motivated to mine big data (such as satellite data) or stream data (such as multimedia or IoT). We work in Python and R, and our main tools are sklearn, TensorFlow, Keras and PyTorch. Prospective students are welcome.

UPDATE: Professor Agić moved to industry (Denmark).

UPDATE: Professor Kalinić received an installation research grant (Croatian Science Foundation) and formed a new research group.

Technion visit

In November 2019 I visited the laboratory of Professor Alex Bronstein at Technion.