Google Composer : dag_id could not be found - google-cloud-composer

I created a collection of DAGs dynamically (using the same .py file for all of them), and there is one DAG, build-DAG, that I cannot run:
airflow.exceptions.AirflowException: dag_id could not be found: `build-DAG`. Either the dag did not exist or it failed to parse.
at get_dag (/usr/local/lib/python2.7/site-packages/airflow/bin/cli.py:130)
at run (/usr/local/lib/python2.7/site-packages/airflow/bin/cli.py:353)
at <module> (/usr/local/bin/airflow:27)
For this DAG I can see previous logs, the code, and everything else in the UI, but I can't run it.
Any idea how to debug this?
Would it help to restart the Composer instances?
The rest of the dynamically created DAGs work fine.
I'm using something similar to this to create the DAGs:
https://gist.github.com/tmarthal/edeae7f6f8780dc53887a16b7b20f205
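Roughly, the generation pattern from that gist boils down to the sketch below. It is only a sketch: fetch_dag_names() and the names it returns are hypothetical stand-ins for my backend call, and the operator is just a placeholder.

# Dynamic-DAG sketch, assuming Airflow 1.9-era imports (Python 2.7 per the traceback above).
# fetch_dag_names() is a hypothetical placeholder for the backend call.
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator


def fetch_dag_names():
    # In my setup this list comes from a backend service.
    return ['build-DAG', 'load-DAG', 'report-DAG']


def create_dag(dag_id):
    dag = DAG(dag_id,
              schedule_interval='@daily',
              start_date=datetime(2018, 1, 1))
    DummyOperator(task_id='start', dag=dag)
    return dag


# Each dag_id must end up as a module-level variable so the parser can find it.
for name in fetch_dag_names():
    globals()[name] = create_dag(name)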
Thanks in advance.
Edu
Update: I'm using composer-0.5.1-airflow-1.9.0.
Update, August 2, 2018: I migrated to composer-1.0.0-airflow-1.9.0 and it still happens.

This wasn't an Airflow issue; it was a consistency issue between my data sources.
I have two backends returning the list of dynamic DAGs, and each backend had a different list of DAG ids.
When the DAG definition file queried Backend A it created 20 DAGs, and when it queried Backend B it created only 18.
Airflow then intermittently failed when I tried to run the 19th DAG: whenever the file happened to be parsed against Backend B, that dag_id simply did not exist.
My solution was to sync both backends.
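To make the fix concrete, a minimal pre-deploy check along these lines (the backend URLs and the fetch helper are hypothetical, not my real code) ensures the two sources can never disagree on the set of dag ids:

# Hypothetical consistency check between the two backends (Python 2.7).
# If a worker parses the DAG file against the "short" backend, the dag_id
# queued by the scheduler is missing, which produces the error above.
import json
import urllib2


def fetch_dag_names(backend_url):
    return set(json.load(urllib2.urlopen(backend_url + '/dags')))


names_a = fetch_dag_names('http://backend-a.internal')
names_b = fetch_dag_names('http://backend-b.internal')

if names_a != names_b:
    raise SystemExit('Backends disagree on: %s' % sorted(names_a ^ names_b))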
Regards
Eduardo

Related

Cannot create docs for components in backstage docker error

I am trying to display docs stored in a repository created by a Backstage component on the Backstage /docs page UI, but when I try to access the docs I get the following error:
Building a newer version of this documentation failed. Error: "Failed to generate docs from C:\\Users\\Admin\\AppData\\Local\\Temp\\backstage-enprxk into C:\\Users\\Admin\\AppData\\Local\\Temp\\techdocs-tmp-W6iVab; caused by Error: Docker container returned a non-zero exit code (1)"
Files in my repository: the docs folder only has index.md, and mkdocs.yml has
nav:
  - Home: index.md
I was getting similar issues working on a local POC of Backstage. The biggest problem was that I needed to install pip, python, mkdocs, and mkdocs-techdocs-core (i.e. pip3 install mkdocs-techdocs-core). If you have done that and then followed everything in this documentation, then it should start working. Hope that helps. I spent a couple of days trying to get past these types of errors.
For me the issue was fixed by the change below, as the Docker-based generator was not working inside my container in Kubernetes.
I changed app-config.yaml:
techdocs:
  builder: 'local' # Alternatives - 'external'
  generator:
    runIn: 'local' # changed from 'docker' to 'local' here

How do I use this config.yml file to run a web scraper that someone else built?

My end goal: I want to fetch data from a retail site on an hourly schedule to see if a specific product is back in stock or not.
I tried using XPath in Python to scrape the site myself, but I'm not too familiar with it, and why reinvent the wheel if someone has already built a scraper? In this case, Diggernaut has a GitHub repo.
https://github.com/Diggernaut/configs/tree/master/bananarepublic.gap.com
I'm using the above GitHub repo to try and run a pre-existing web scraper on the Banana Republic retail site. All that's included in the folder is a config.yml file. I don't even know where to start to try and run it... I am not familiar with .yml files at all and barely know my way around a terminal (I can do basic "ls", "cd", and "brew install"; otherwise, no idea).
Help! I have docker and git installed (not that I know how to use docker). I have a Mac version 10.13.6 (High Sierra).
I'm not sure why you're looking at using Docker for this, as the config.yml is designed for use on Diggernaut.com and not as part of a docker container deployment. In fact, there is no docker container for Diggernaut that exists as far as I can see.
On the main Github config page for Diggernaut they list the following instructions:
All configs can be used with the Diggernaut service to retrieve product information.
1. Create a free account at Diggernaut.
2. Log in to your account.
3. Create a project with any name and description you want.
4. Get into your new project by clicking it and create a new digger with any name.
5. You will then see 3 suggested options; use the one where you write in the meta-language.
6. The config editor will open; simply copy and paste the config code and click the save button.
7. Switch the digger's mode from Debug to Active and then run your digger.
8. Wait for completion.
9. Download the data.
10. Schedule your runs if required.

Docker failing to see updated fixtures CSV in rspec test directory

This one is quite strange.
I am running a very typical Docker container that holds a Rails API. Inside this API, I have an endpoint which takes an upload of a CSV and does some things and stuff.
Here is the exact flow:
vim spec/fixtures/bid_update.csv
# fill it with some data
# now we call the spec that uses this fixture
docker-compose run --rm web bundle exec rspec spec/requests/bids_spec.rb
# and now the csv is loaded and I can see it as plaintext
However, after creating this, I decided to change the content of the CSV, adding a column and a corresponding value for each row.
Now, however, when we run the spec again after saving, it still sees the old version of the CSV: the one originally used at the breakpoint in the spec.
cat'ing out the CSV shows it clearly should have the new content.
Restarting the VM does nothing. The only solution I've found is to docker-machine rm dev and build a new machine (my main one for this is called dev).
I am entirely perplexed as to what could cause this or a simple means to fix it (building with all those images takes a while).
Ideas? Inform me I'm an idiot and I just had to press 0 for an operator and they would have fixed it?
Any help appreciated :)
I think it could be an issue with how VirtualBox shares folders with your environment. More information here: https://github.com/mitchellh/vagrant/issues/351#issuecomment-1339640

Automate Twitter Bot

I have made a Twitter bot in Python that posts a tweet with the weather info for a specific city. I test it by running python file.py and then checking on my Twitter account that it worked.
But how can I execute it periodically? Where can I upload my source code? Is there any free server that will run my file.py for free?
Assuming you're running GNU/Linux and your machine is online most of the time, you can configure your own crontab to run your script periodically.
Check: https://www.freebsd.org/doc/handbook/configtuning-cron.html
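For example, a crontab entry along these lines (the interpreter and paths are placeholders; edit your crontab with crontab -e) would run the bot once an hour:

# m h dom mon dow  command
0 * * * * /usr/bin/python /home/youruser/file.py >> /home/youruser/bot.log 2>&1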
If that is not the case, check out https://wiki.python.org/moin/FreeHosts for your purpose; the first one on the list should do the job (https://www.pythonanywhere.com/).
You can host your code in a GitHub repository, then run your .py file through a GitHub Action that runs on a schedule you set up in a .yml file in the .github/workflows folder.
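As a minimal sketch (assuming the script is file.py at the repository root and any API keys are stored as repository secrets; the secret names below are placeholders), .github/workflows/tweet.yml could look like:

name: hourly-weather-tweet
on:
  schedule:
    - cron: '0 * * * *'    # every hour, UTC
  workflow_dispatch: {}    # allow manual runs for testing
jobs:
  tweet:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install -r requirements.txt
      - run: python file.py
        env:
          TWITTER_API_KEY: ${{ secrets.TWITTER_API_KEY }}
          TWITTER_API_SECRET: ${{ secrets.TWITTER_API_SECRET }}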

Neo4j StartService FAILED 1053 after copying Beer DB example

I installed Neo4j as instructed on the site and was able to install and start the server. However, I then tried to copy the Beer example DB by stopping the server, deleting the current graph.db in the \data folder, and replacing it with the Beer example graph.db folder downloaded from online. This is the only step I did.
Now the issue is that when I try to start the server I get "StartService FAILED 1053".
I am using the following command in PowerShell on Windows: c:\neo4j-community-2.0.0-M03> .\bat\Neo4j.bat start
Can someone please tell me if I have done anything wrong here.
Thank you!
You are running Neo4j 2.0 against an older database file. You'll need to set the config parameter that allows the store to be upgraded before starting. See the instructions here:
http://docs.neo4j.org/chunked/milestone/deployment-upgrading.html#_explicit_upgrade
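Concretely, assuming the 2.0.x layout described in that guide, that means adding this line to conf/neo4j.properties before starting the service again:

# allow Neo4j to upgrade the older store files in place
allow_store_upgrade=true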
