Real-Time Data Transfer from PostgreSQL to MySQL

I have two machines: an Industrial PC (IPC) running Windows 10, and my laptop running Ubuntu Linux in a VM.
Databases: the IPC has a monorepo with PostgreSQL data from the automation line.
My Django app uses MySQL.

When I connect an Ethernet cable from my laptop to the IPC, the PostgreSQL data becomes reachable through the models (models.py) of my monorepo app.

The problem: whenever the IPC produces timestamps, I’d like to fetch that data and display it on the web — that is, transfer the data from PostgreSQL to MySQL. With the Ethernet cable plugged in, I can already retrieve the PostgreSQL data via the Python shell.

I am planning to use Django Signals to get notified when data is received by my monorepo models, and then to handle that data using Celery, Channels, and WebSockets. The goal: whenever PostgreSQL produces the data I need, I’d like to see it in a specific section of a specific page without refreshing the page. What would be the steps to implement this?

In addition to this, do I need specific database drivers, such as mysqlclient and psycopg2, for data transfer between the two databases?
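For what it’s worth: Django does need mysqlclient for the MySQL connection and psycopg2 for the PostgreSQL one, but no extra "bridge" driver — both implement the standard DB-API (PEP 249), so a raw copy loop looks the same with either. A minimal sketch of that copy pattern, using sqlite3 stand-ins so it runs anywhere (the `events` table and its columns are invented for illustration):

```python
import sqlite3

# In production these would be psycopg2.connect(...) and MySQLdb.connect(...);
# sqlite3 stands in here only because all three drivers follow PEP 249.
source = sqlite3.connect(":memory:")   # stands in for PostgreSQL on the IPC
dest = sqlite3.connect(":memory:")     # stands in for MySQL on the laptop

source.execute("CREATE TABLE events (command INTEGER, date_started TEXT, date_done TEXT)")
source.execute("INSERT INTO events VALUES (1, '2023-01-01T08:00:00', '2023-01-01T08:05:00')")
dest.execute("CREATE TABLE events (command INTEGER, date_started TEXT, date_done TEXT)")

# The copy loop itself: fetch from one connection, insert into the other.
rows = source.execute("SELECT command, date_started, date_done FROM events").fetchall()
dest.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
dest.commit()

copied = dest.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(copied)  # → 1
```

Note that the placeholder style differs per driver (`?` for sqlite3, `%s` for both psycopg2 and mysqlclient), but the fetch-then-executemany shape carries over unchanged.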

```
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'OPTIONS': {
            'read_default_file': '/etc/mysql/my.cnf',
        },
    },
    'MICE': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'app',
        'USER': 'postgres',
        'PASSWORD': 'pass',
        'HOST': 'x.x.x.x',
        'PORT': '5432',
        'POSTGRES_NAME': 'postgres',
        'POSTGRES_USER': 'postgres',
        'POSTGRES_PASSWORD': 'pass',
    }
}
```

Have you worked your way through the official Django Channels tutorial? If not, that’s your best place to start.

The Chat application really is a good analogy, if you consider the data source as being the sender of the chat messages and your display application being the recipient of those messages.

Once you’ve worked your way through the tutorial to the point where you understand what everything is doing, you’ll be in a really good position to implement your process.

That’s probably the worst thing to try to use here. If you’re going to use Channels, use it all the way through.

Hi Ken,

I really wanted to get feedback from you, and I appreciate it.

Okay, I’ll look at the Django Channels tutorial, and I’ll remove Signals as recommended. But I don’t think that’s the whole point. In the end, I will need to fetch many attributes from many models, and this real-time data will be placed in specific sections or cards. That’s what leads me to think about fetching and processing the data periodically using Celery — I imagine I’ll end up with a massive tasks.py. I’d like your feedback on using Celery for this task.

Also, do you see any extra database configuration needed related to mysqlclient and psycopg2?

I’d need a lot more specific details about your requirements to even begin to think about offering suggestions. (General terms such as “many”, “some”, “periodically” have different meanings to different people.)

Django works quite well with multiple databases. See Multiple databases | Django documentation | Django
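With the two aliases shown in the DATABASES setting above, Django’s multi-database support lets you read from one connection and write to another explicitly. A sketch, assuming a hypothetical `Event` model whose table exists (with the same schema) on both connections — this fragment needs a configured Django project to run:

```python
# Hypothetical model; assumes the same table exists on both connections.
from myapp.models import Event

# Read the new rows from the PostgreSQL connection on the IPC...
new_events = Event.objects.using('MICE').filter(command=1)

# ...and copy each one into the default (MySQL) database.
for event in new_events:
    event.pk = None               # force an INSERT instead of an UPDATE
    event.save(using='default')
```

The earlier driver question applies here too: mysqlclient must be installed for the `default` alias and psycopg2 for `MICE`, but the copy itself happens in Python — no database-to-database link is involved.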

When the LAN connection is up, PostgreSQL sends real-time data from the automation cell into my common model’s attributes, such as date_started, date_done, and command. The first two are timestamps, while command (an integer field) identifies the specific process: if it is 1, the timestamps need to go to that process. What I want is this: whenever PostgreSQL produces timestamps (or any data), I’d like to see them on the process card without refreshing the page. This is the foundation of my requirements.
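The "command identifies the process card" part is essentially routing: map the integer to a Channels group name plus a payload, so a consumer subscribed to that group can push it down the WebSocket. A sketch of that mapping in plain Python — the group-name scheme and payload keys here are invented for illustration:

```python
# Map a row from the automation cell to a Channels group and message payload.
# The "process_<command>" naming and the payload keys are assumptions; in a
# real project, channel_layer.group_send(group, payload) would push this to
# every WebSocket consumer subscribed to that process card.

def route_row(row):
    group = f"process_{row['command']}"
    payload = {
        "type": "process.update",        # maps to a consumer handler method
        "date_started": row["date_started"],
        "date_done": row["date_done"],
    }
    return group, payload

group, payload = route_row(
    {"command": 1, "date_started": "08:00:00", "date_done": "08:05:00"}
)
print(group)             # → process_1
print(payload["type"])   # → process.update
```

On the browser side, the page’s JavaScript would open a WebSocket for its card’s group and update the card’s DOM on each message — which is exactly the no-refresh behavior the Channels chat tutorial demonstrates.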

What, specifically and technically, do you mean by “PostgreSQL sends real-time data”?

How does PostgreSQL produce this data?

How does PostgreSQL send this data?

```
from celery import shared_task
from billiard.exceptions import SoftTimeLimitExceeded
from django.utils import timezone
from django.conf import settings
import logging
import requests

LOGGER = logging.getLogger(__name__)


def create_event_message(event_type, event_data, event_timestamp):
    return {
        "line_name": settings.LINE_NAME,
        "cell_name": settings.CELL_NAME,
        "sender_name": settings.SENDER_NAME,
        "timestamp": timezone.now().isoformat(),
        "message_type": "events",
        "message": {
            "event_type": event_type,
            "event_message": "",
            "event_data": {**event_data},
            "event_timestamp": event_timestamp,
        },
        "meta": {},
    }


@shared_task
def publish_event(event_type: str, event_data: dict, event_timestamp: str) -> None:
    """Generic task to publish an event to an endpoint. Can be called from a Celery worker.

    Args:
        event_type (str): type of the event
        event_data (dict): all data relevant to the event
        event_timestamp (str): timestamp when the event occurred

    """
    try:
        _publish_event(event_type, event_data, event_timestamp)
    except SoftTimeLimitExceeded:
        LOGGER.error("publish_event timeout exceeded!")


def _publish_event(event_type: str, event_data: dict, event_timestamp: str) -> None:
    if settings.EVENT_STORE_ENABLED:
        try:
            requests.post(
                settings.EVENT_STORE_ENDPOINT,
                json=create_event_message(event_type, event_data, event_timestamp),
                timeout=5,
            )
        except requests.exceptions.Timeout:
            LOGGER.error("Event message timed out!")
        except Exception as exc:
            LOGGER.error("An exception occurred while sending the message: %s", exc)
    else:
        LOGGER.info(
            f"Event Store was disabled, missed the following event: '{event_type}', {event_data} occurring at {event_timestamp}"
        )
```

First, a side note: you need to enclose code between lines of three backtick (```) characters to maintain the proper formatting. (I fixed this for you on this post.)

Unfortunately, this code, without any context, doesn’t help at all. (This also doesn’t appear to imply PostgreSQL doing anything at all as part of this.)

Please provide an actual description of what you’re trying to do, with enough of an explanation of what the components involved are to be able to understand what the source and destination of data is at each step.

Why did you think that using Django Signals would be the worst thing here? Both devices have databases and codebases. PostgreSQL produces data while the automation cell is running, and whenever it does, the goal is to display that data on the web in real time.

Don’t Django Signals notify us when that data arrives?

I’d like to know: what would your approach to this issue be? Channels are fine; the problem is acquiring the data. Would you create models for the incoming data and store it?

Which file should handle the incoming data from PostgreSQL first? Would you handle it directly with Celery tasks in tasks.py?