When I pass data to the database via a query (in the Python shell), I can display it on the web without refreshing the page. However, when an Ethernet cable, the VM's internal default switch, and multiple databases are involved, Django can't capture the data and display it on the web.
When I plug an Ethernet cable between the devices, data flows into my app because I use Django's multiple databases feature in settings.py. But I don't understand why Django cannot capture the data asynchronously and push it to the web. What's the conventional way to do this? The sender device stores its data in PostgreSQL, whereas the receiving Django project uses MySQL.
I use signals.py to capture the data where it flows into my app. When I run a query, I can retrieve the data I want, but I can't display it on the web.
Having the Ethernet cable alone doesn't cause data to flow.
It can, but there's either a conceptual or a terminology issue here that needs to be resolved.
Things don’t just “happen” by themselves. There’s some process involved with causing these things to occur.
Most importantly, it’s not the database (neither PostgreSQL nor MySQL) that initiates data transfers. There’s some other process involved here, and that is where we need more clarity.
The Python shell (python manage.py shell) provides an API, right? We can manipulate the db from there. When I pass data from there, I can change the data on the web without refreshing the page.
For the second question, I use Django's multiple databases feature, which means I added the second database (PostgreSQL) to settings.py. That allows me to connect to the external db. When I ping it, I get packets, so the connection is fine.
I have two apps involved in this process. One of them receives the data; let's call it App1. When I run a query via the Python shell, I can retrieve the timestamps or any other data that App1's models.py holds. App2 is responsible for displaying the data.
App1/signals.py is supposed to react whenever PostgreSQL produces data (there is a large codebase that sends that data continuously while the Ethernet cable is plugged in), whereas App1/tasks.py updates the relevant attributes of App2/models.py with the attributes from App1/models.py. Two attributes should be updated with new data without refreshing the page.
The problem is capturing that data. Once I capture it, I can display it; the WebSocket works.
Not precisely, no. The Python shell gives you interactive access to the Django ORM API, but it does not provide any APIs of its own.
This is absolutely correct, yes.
Is that a question or a statement? If a statement, how are you doing it now?
If it is a question, yes. You can create a system that causes the browser to update the data being displayed without a page refresh. (It involves some JavaScript in the browser, but it can be done.)
PostgreSQL does not “produce” data. PostgreSQL stores and retrieves data upon request by a separate process.
This would all happen within the server. A browser has no part in this.
PostgreSQL stores Device 1's data (Device 1 is the sender). MySQL stores Device 2's data (Device 2 is the receiver). Both devices run Django, but the databases are different.
Using the Django ORM API in Device 2's Django, I can change the data on the web without refreshing the page. But when multiple databases are involved, with Device 1's PostgreSQL as the external db, Device 2's Django cannot capture the data into MySQL asynchronously. Yet when I query Device 2's MySQL db from the shell via the ORM API, I can see that the data I need is there!
I am investigating why I cannot capture the data I need (remember, I can update the desired value by passing dummy data via a query, so my WebSocket works!) and update the attributes. Is this a connection problem? A database configuration issue? I am confident that once I capture the data, I can display it.
Again, is this a statement or a question? If a statement, please expand on what you’re doing. If it’s a question, what is the question?
Again, is this a question or a statement? If it’s a question, yes, this can be done.
Almost certainly, it’s a code problem. If you’re connecting to both databases from the same Django project, then it’s neither a connection nor configuration issue.
But at this point we’re probably going to need to see the code involved to make any further progress.
Ok, there may be a misunderstanding here of what signals are in Django and how they work.
They are not triggered by arbitrary external events. They are triggered within a specific instance of a Django process, by code in that process performing the action that fires the signal.
Django signals are not a form of inter-process communication or a general event-monitoring system.
If you are looking to send updates out through a web socket, then the code that is saving the data to the database needs to forward the data through the channels layer in addition to writing the data to the database.
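As a rough sketch of that idea, assuming Channels is installed with a channel layer configured (the group name "timestamps", the message type, and the surrounding function are placeholders, not your code):

```python
# Wherever the new data gets written (a view, a Celery task, a management
# command, ...), push it through the channel layer in addition to saving it.
from asgiref.sync import async_to_sync
from channels.layers import get_channel_layer


def save_and_broadcast(timestamp_value):
    # ... write timestamp_value to the database here ...

    channel_layer = get_channel_layer()
    # "timestamp.update" maps to a timestamp_update() handler on any
    # consumer that has joined the "timestamps" group.
    async_to_sync(channel_layer.group_send)(
        "timestamps",
        {"type": "timestamp.update", "data": str(timestamp_value)},
    )
```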
So signals are redundant for this. How about a Celery task in App1? Instead of signals.py, should I leverage Channels only? What should the order of steps be?
If both processes are sending data out through channels to go out via websockets, I’d send the data through the channels layer from each project (or celery task) as necessary.
If the desired flow of data is project 1 -> browser and project 2 -> browser, then I would have the appropriate code in each project sending the data through the channels layer at whatever point in the code that you want to do that.
I don't understand project 1 and project 2, but I think you are suggesting that I remove the signals and do the whole operation via Celery/tasks.py.
App1 gets the real-time data. App2's signals.py sends data to a Django consumer, which transmits the WebSocket messages to the frontend. I suppose you are suggesting that these operations should be done by Celery/tasks.py for each app instead of Django signals.
How? (What is it doing?) What is it doing with this data after it has gotten it?
How does Celery fit into this?
What does app2 do with this data? Is it intended to only serve as an intermediary to the consumers?
It may be better (and possibly easier) if you temporarily forget about your current implementation, and describe the architecture of the solution that you are trying to build. Don’t try to address this in terms of code, but describe what systems are involved, and what needs to happen among them.
The industrial PC (IPC) has a PostgreSQL database and a Django project of its own. It works with the PLC and the whole automation system. Whenever the process starts, the IPC's code produces timestamps at that moment.
My Django project uses MySQL. App1 and App2 belong to my Django project. App1's models are empty when the Ethernet cable is not plugged in; I can see the IPC's data in App1's models when the cable is plugged in. Clear, right?
App 2 is the main app. It contains the timestamps. App 1 has live timestamps from PostgreSQL. Those live timestamps need to replace App 2’s timestamps.
Signals were originally intended as the handlers for incoming data, but obviously they will be removed. The thinking here is to handle the incoming data in App1 and pass it to App2 so that we can display the timestamps in real time (App2/templates), without refreshing the page.
So it is basically a real-time data transfer between two different devices with two different databases. I don't understand the emphasis on the general architecture. The data is already in Django's App1. The problem is how to convey it to the browser.
Celery's job is handling command attributes and timestamps periodically. It was thought of as the first stop for incoming data: data needs to be captured first and then conveyed to App2.
The main issue is how to handle incoming data. What is the most conventional or solid way to do this?
Clear, but not particularly helpful. All it does is specify that Django is accessing some database on a different system - in this case, the IPC.
Because your usage of terminology is non-standard, and does not adequately express what is physically happening here. If “App 1” and “App 2” are Django apps within the same project, then there is no “passing” of data between the two - at least not without some clarification of what you mean by this.
If “App 1” and “App 2” are Django apps within two different projects, then that just raises different questions.
(And I’m not entirely sure that you are using the terms “App 1” and “App 2” as Django uses them, or if the term “App” has some other meaning to you.)
Which leads to:
Actually, it’s not. The data is in your database. The Django app is a means of accessing that data. (You’ve identified that earlier by saying that if there is no network connection, you don’t have access to the data.)
So if I’m understanding you correctly:
The IPC is writing data to a PostgreSQL database. It is writing to that database directly, not through your Django app.
You want to have Django recognize that this data has been written, read that new data, and write it to a different (MySQL) database.
If so, then no - Django is not going to do this for you. Writing data directly to the PostgreSQL database is not going to trigger any action on the part of Django to process that data on its own. Something outside this process needs to do that.
Either:
Your IPC needs to send the data directly to Django instead of writing it to the PostgreSQL database. (This can be done as an HTTP API/REST-style call using the standard mechanisms, or as a custom listener running as a separate process.)
You need to have a trigger, or some other database-based process that can send a message to Django to let it know that new data has been written. (It can either just send a notification, or it can send the data itself as described above.)
Django can poll the database periodically to check for new data to be copied. (This would typically be done outside the context of your regular Django process.)
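For that last (polling) option, here is a minimal sketch of the periodic job, assuming a database alias named "ipc" in settings.DATABASES and two placeholder models: IpcTimestamp (the unmanaged source table) and LocalTimestamp (the target in MySQL). None of these names are from your project; adjust as needed:

```python
# Run this periodically (Celery beat task, management command from cron, etc.).
from django.db.models import Max

from app1.models import IpcTimestamp    # placeholder: unmanaged model read via the "ipc" alias
from app2.models import LocalTimestamp  # placeholder: model in the local MySQL database


def copy_new_timestamps():
    # The newest timestamp already copied locally, or None on the first run.
    latest = LocalTimestamp.objects.aggregate(latest=Max("timestamp"))["latest"]

    rows = IpcTimestamp.objects.using("ipc").order_by("timestamp")
    if latest is not None:
        rows = rows.filter(timestamp__gt=latest)

    for row in rows:
        # Copy into MySQL; this is also the natural point to push the value
        # out through the channel layer for the browsers.
        LocalTimestamp.objects.create(timestamp=row.timestamp)
```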
In all of these situations, it would then be Django’s responsibility to copy the data from PostgreSQL to MySQL - with the additional potential function of making that data available to a browser.
This can be done a couple different ways, and it depends upon how your browser interface is designed.
If your front end relies on polling the server for data, then the view being called would query the database to retrieve the data to be shown, and return it to the browser.
If you want it to be a “server push” situation, then you would need something like Channels to handle the websocket connection with the browser. The Django app handling the data could send the updated data through the channel layer to Channels, to have it distributed to the browsers.
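For the Channels case, the consumer can be as small as this sketch (the group name must match whatever the sending code uses; all names here are placeholders):

```python
# consumers.py (placeholder location)
from channels.generic.websocket import AsyncJsonWebsocketConsumer


class TimestampConsumer(AsyncJsonWebsocketConsumer):
    group_name = "timestamps"  # must match the group used in group_send()

    async def connect(self):
        await self.channel_layer.group_add(self.group_name, self.channel_name)
        await self.accept()

    async def disconnect(self, close_code):
        await self.channel_layer.group_discard(self.group_name, self.channel_name)

    async def timestamp_update(self, event):
        # Invoked for channel-layer messages of type "timestamp.update".
        await self.send_json({"timestamp": event["data"]})
```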
I am using your third option, polling via a Celery task in my Django app. To do that, I added a new model that stores the last processed timestamp from the PostgreSQL database. However, because the model holds that historical data, I am struggling to poll the live data after each execution.
What filtering can you recommend for the ORM query against that external database? Also, do you recommend creating a model for the live timestamps and building the query around that? A third option could be a simple 'if' check.
I am currently polling the timestamps successfully. I am using `.using()` in the query against the external db and setting managed=False in models.py so that Django does not create or migrate the table in the local database.
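Roughly this setup, where the names are placeholders for my actual models and table:

```python
# app1/models.py -- mirrors the table that the IPC writes to; Django must not
# create or migrate it, hence managed = False.
from django.db import models


class IpcTimestamp(models.Model):
    timestamp = models.DateTimeField()

    class Meta:
        managed = False
        db_table = "ipc_timestamps"  # placeholder for the real table name


# In the Celery task the external alias is selected explicitly, e.g.:
# IpcTimestamp.objects.using("ipc").order_by("-timestamp")
```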
The data (the last record or timestamp) in the timestamp model that I implemented, along with its id, is automatically pushed after the execution. But it belongs to yesterday, or the day before, etc. What I need is the live data that will come from the ids within the threshold/limit interval. I need a guard that blocks this historical data so that only live data is polled. Does that make sense?
From what I’m understanding, it seems to me that you want to pull all rows from the originating model with a timestamp greater than the maximum timestamp in the target model. Is that it? Or am I missing something?
I am polling all rows of a specific attribute (a timestamp/date) from the originating model, within a certain interval defined by a threshold and a limit. That's correct. Because I don't want to start from where I left off, I use that interval. For example, if the id is 1000, the threshold is -300, and the limit is +800, then the IDs between 700 and 1800 are queried. Each ID has timestamps.
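Concretely, something like this sketch (the model name, current_id, and the values are placeholders for my actual code):

```python
THRESHOLD = -300  # how far behind the current id to start
LIMIT = 800       # how far ahead of the current id to go

# e.g. current_id = 1000 -> rows with ids 700..1800 are queried from the IPC db
rows = (
    IpcTimestamp.objects.using("ipc")
    .filter(id__range=(current_id + THRESHOLD, current_id + LIMIT))
    .order_by("id")
)
```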
On the other hand, I store that captured timestamp as a single record in my local database. My goal is to distribute many timestamps to certain sections of my website. The problem is that I poll historical data and push it to the web every time I perform a new execution. I only need live data, to be honest.