Should I use Rails for consistency? (for ETL project)

CONTEXT
I'm new to Ruby and all that jazz, but I'm not new to dev.
I'm taking over a project based on 2 rails/puma repositories for web & APIs.
I'm building a new repository for a backend data processing app, using Kiba, that will run through scheduled jobs.
Also, I'm to be joined by other devs later on, so I'd like to make something maintainable by design.
MY QUESTION : Should I use Rails on that ETL project?
Using it means we can apply the same folder structure as the other repos, use RSpec all the same etc. It also appeared to me that Rails changes the way classes like Hash act.
At the same time, it seems to bring unnecessary complexity to a project that will run from the CLI and could consist of only a dozen files.

Kiba author here! This is an important question, thanks for asking it!
MY QUESTION : Should I use Rails on that ETL project?
By default, I would recommend starting with a separate project (a kind of "macro-service" approach), unless you have important things (more than just RSpec & ENV setup) to reuse from the Rails app.
If you expect significant coupling between the app and the ETL (e.g. by "scheduled jobs" you mean jobs triggered through Sidekiq to react to events, or you have classes shared between the 2 projects), then you can place the ETL in an etl subfolder of your Rails app, for instance. This provides a bit of separation and leaves the opportunity to split the code out later if that becomes a better path (this is a middle ground I'm using on some projects).
If it is not the case, though, and the data pipeline is expected to become large and live its own life, you can instead split it to its own project.
Using it means we can apply the same folder structure as the other repos, use RSpec all the same etc.
You can use RSpec or minitest from a dedicated ETL (pure Ruby) project too, introduce a notion of ETL_ENV (development, test, production), build your own ENV-based (or file based) configuration with dotenv or similar, and support cron jobs from there too if you need that.
Pure Ruby projects can be structured just like a Rails app, and there is usually less magic (more explicit), which is helpful.
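As a rough illustration of the ETL_ENV idea, here is a minimal sketch that uses only the standard library. The module name ETL, the file path and the helper names are assumptions made for this example; in a real project, dotenv (or similar) could populate ENV from a file before this code runs.

```ruby
# config/environment.rb (hypothetical path)
module ETL
  VALID_ENVS = %w[development test production].freeze

  # Plays the role RAILS_ENV plays in Rails: defaults to development.
  def self.env(source = ENV)
    value = source.fetch("ETL_ENV", "development")
    unless VALID_ENVS.include?(value)
      raise ArgumentError, "unknown ETL_ENV: #{value.inspect}"
    end
    value
  end

  # Fetches a required setting and fails loudly when it is missing,
  # so a misconfigured scheduled job stops before touching any data.
  def self.setting(name, source = ENV)
    source.fetch(name) { raise KeyError, "missing setting: #{name}" }
  end
end
```

A cron entry could then set ETL_ENV=production before invoking the job, and specs could pass a plain Hash instead of ENV to keep tests isolated.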
It also appeared to me that Rails changes the way classes like Hash act.
I would actually recommend being explicit about depending on that. Today I prefer to "cherry-pick" the exact extensions I need, at the top of each file (as described here).
One last word: you can test Kiba ETL pipelines just as much as your individual ETL components, and I would recommend doing so (I will cover that in a future blog post), since it helps with moving things around, upgrading Ruby with ease, and generally scaling the team of developers easily (CI + tests).
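As a sketch of what testing an individual component can look like: a Kiba class transform is a plain Ruby object with a process(row) method that returns the row to keep it or nil to drop it, so it can be exercised without running a pipeline. The NormalizeEmail class and the sample rows are hypothetical.

```ruby
# A transform in the shape Kiba calls: #process(row) returns the row
# to keep it, or nil to remove it from the pipeline.
class NormalizeEmail
  def process(row)
    email = row[:email].to_s.strip.downcase
    return nil if email.empty?

    row.merge(email: email)
  end
end

transform = NormalizeEmail.new

# A row with a messy email is kept and cleaned up.
kept = transform.process({ id: 1, email: "  Ada@Example.COM " })

# A row with a blank email is dropped.
dropped = transform.process({ id: 2, email: "   " })
```

The same two calls translate directly into RSpec or minitest expectations.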
I hope this provides enough guidance for you to make a decision on this. If this is not the case, please comment!

From my point of view using Rails for ETL projects is an overhead.
Take a look at dry-rb. Using https://dry-rb.org/gems/dry-system/ you can build a small application to process data. There is also a gem for building CLIs: https://dry-rb.org/gems/dry-cli/
Here is a list of all dry gems https://dry-rb.org/gems/

Related

Another rails upgrade dilemma

Soliciting advice about upgrading (or re-writing?) a legacy app. It's a single page webapp with lots of dynamically generated windows and forms, roughly composed of
13,000 lines in .rb files
11,000 lines in .erb files
25,000 lines of javascript (not including large 3rd party libraries that bring this to nearer 60000 lines)
This acts as a UI for end users of our system, which also has a number of core business services (mostly written in Java, with a small amount of Node.js) and a fairly sizeable MySQL database (>200GB). Some of these services push AJAX to the client browser for real-time updates.
Reasons for upgrade
It's ruby 1.8.7, rails 2.3.15. Most of the core code dates from 2009. This makes it both insecure and hard to maintain (think "predates the existence of gemfiles".)
The app has been maintained by Java devs for most of its life, as most of the company's devs have been hired as Java devs to work on all the other services that perform business logic. It's probably safe to assume that this has led to lots of hacks from people who didn't want to break anything, and certainly a lot of it will not be done in a "rails way".
The javascript is also a bit of a mess. It's got a knot of frameworks (the original Angular is used sporadically; jquery and prototype are both fighting over the $ symbol in different places.) There are files that are 7000 lines long.
The CSS styling has been upgraded since 2009(!) but is starting to look a little tired. We'd like to implement a bootstrap theme that will look sharp without too much front-end skill, but right now the code that renders all of our pop-up windows, sidebars etc breaks badly if we try to add bootstrap.
It would be nice to modernise our push servers, replacing them with websockets.
Context
There are 3 of us on the dev team- this is my first job, and I've only been here since January. Of the other two, one has only been here about 4 months longer than me, and it's his first dev job too. The other guy is the only one who has ever spoken to someone who spoke to someone from the original team.
Oh, and of course we have little or no test coverage.
Options
When I was hired (as a Java developer), I was told that we were looking to replace the website with one based on Spring MVC. This effort is partially underway, having been attacked in dribs and drabs over a couple of years. Because different devs who never met have attacked it as if it's their own brand new project, the same problems are solved in different ways in different places. They've tried some flashy techniques such as custom annotations that I find hard to follow, and that as far as I can tell don't fully work. The most senior of us estimates it would take our team a year's dedicated full-time work to finish it (which is not a realistic business proposition, based on how many requests for new features we get from customers).
I'm inclined to upgrade the website rather than spin out a new one. This is partly because I can see the sense in that post. Another reason is that we're all employed as jack-of-all trade full stack developers (doubling as DBAs, sysadmins, etc...). We've got no particular expertise in UI design, and the UI for our present interface, although dated, is pretty user-friendly; it feels like a blank canvas would throw that structure out, and play to our weaknesses. Upgrading ruby/rails would also make any features we add during the upgrade much easier to add to the new site.
Apparently some experienced ruby devs who are friends with my boss have advised him informally it'd be so much work to bring the website up to date that it would be comparable to a complete re-write, which was the motivation for the spring project. This would have the advantage of only having to think about Java + javascript, and not trying to hire people who know both Java and ruby well.
Conventional wisdom seems to be to upgrade rails in stages. I'm not sure how well this would work for us, for 2 reasons. For one thing, there are 3 major versions for us to upgrade through, which might have significant changes between them. More importantly, the code needs some TLC anyway, with refactoring and the creation of a test suite.
I'm inclined to follow the following strategy:
1. Invest some developer time in training, to get a sense of the relevant best practises and the "rails ways" of doing things, rather than the "good enough to hack" knowledge most of us have now.
2. Fire up a new rails 5.1 project on ruby 2.4.0.
3. Configure active record to use our old database.
4. Copy across the javascript from the public folder of the existing project into the relevant parts of the asset pipeline and save that headache for "phase 2".
5. Sort out a gemfile with updated versions of our dependencies (for example, the mysql gem has been replaced by the mysql2 gem). Installing rubocop seems like a good idea at this point.
6. Copy files across from the old to the new project one at a time. Read the code, figure out what it's doing, write the relevant tests, fix where they break. Use the ruby API and rails upgrade guides to update the syntax. Refactor until rubocop is appeased.
7. Once we've reproduced the functionality of the existing site, write a new stackoverflow post on how to sort out the javascript ;)
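For illustration only, the gemfile from this plan might start out roughly like this. Only the ruby and rails versions, mysql2 and rubocop come from the plan; everything else is left open, and the grouping is an assumption.

```ruby
# Gemfile (sketch; only the versions named in the plan are pinned)
source "https://rubygems.org"

ruby "2.4.0"

gem "rails", "~> 5.1.0"
gem "mysql2" # replaces the old mysql gem

group :development do
  # Static analysis only, so it does not need to be loaded at runtime.
  gem "rubocop", require: false
end
```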
This certainly sounds like a lot of work, but seems less likely to produce a buggy mess than trying to reproduce our existing functionality from scratch in a different language. So...
Questions
Does this strategy seem sensible? Is this a case where the re-write is really a better option? Is tackling the JS separately the best call, or is it better to restructure it as we're examining the calls to it from the views? Or should we really upgrade -> 3.0, 3.1, ... 5.0, 5.1?
We've altered the database manually, adding new tables, new fields and whatnot directly rather than using rails generate. 'Rails magic' seems to make this work at present, but should we anticipate problems in step 3?
Is there any logical order in which to approach the migration of ruby? As there's major changes to the routing, which is the central entry point of the application, it seems sensible to start there, followed by authentication, then the main page, and then add functions one at a time.
Part of the problem is "not knowing what we don't know" about the rails way of doing things. Apart from the canonical Ruby/Rails tutorials (Hartl, Ruby Monk, Ruby Koans, Kehoe's rails book), is there any essential reading we should be aware of before trying to take on such a large job? I'm especially thinking about things that may not be immediately obvious, like proper use of helper functions, module structure, etc.
Any other advice, comments, prayers, ... welcome!

A seasoned programmer, confused on how to start coding a rails app from scratch

I have been involved with several ruby on rails projects in the past, but I joined in those projects with a completely built rails app, complete with spec tests, factories, models, views, controllers and some custom libraries. I did hundreds of commits to fix assigned tickets in those projects, and I know the Rails MVC architecture well.
But now is the time that I need to create a Rails app from scratch, on my own, and despite my rails experience, I don't have the confidence to start.
My biggest problem is deciding which controllers to create, what the purpose of each controller is, and the overall design of the web app. There are so many things running through my head right now that I just can't sort them out.
Has anyone been in the same situation, or encountered the same problem before?
When starting a new application, I usually sketch out (on paper) a really high-level overview of what I think the core models of the app are going to be, and how they relate to one another. Obviously these will likely change as your application evolves during the development process, but it's a good place to start.
Then, from that pool, I identify the model which is most 'core' to the application's purpose - and I start by generating that with rails g model ModelName. For example: I was recently working on a hotel directory and started with the Hotel model.
Then I apply some pretty basic TDD methodologies and start writing the spec for the unit test for that model, run the tests, build, write more spec, refactor, etc. As bits fall into place, do a commit and move on. That way, if you go down a wrong path, it's easy to get back to a healthy spot again and you don't have to worry (as much) about making mistakes.
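A minimal sketch of that test-first loop, using a plain Ruby class as a stand-in for the generated ActiveRecord model and Minitest instead of RSpec so that it runs without Rails. The attributes and the validity rule are assumptions for the example.

```ruby
require "minitest/autorun"

# Plain Ruby stand-in for the generated Hotel model.
class Hotel
  attr_reader :name, :city

  def initialize(name:, city:)
    @name = name
    @city = city
  end

  # A hotel needs a non-empty name and city to be listed.
  def valid?
    [name, city].none? { |value| value.to_s.strip.empty? }
  end
end

# Written first; the class above is then filled in until these pass.
class HotelTest < Minitest::Test
  def test_valid_with_name_and_city
    assert Hotel.new(name: "Grand", city: "Lisbon").valid?
  end

  def test_invalid_without_name
    refute Hotel.new(name: " ", city: "Lisbon").valid?
  end
end
```

Each passing test is a natural commit point, which is what makes it cheap to back out of a wrong path.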
I usually find that by starting with the unit tests for the core models, the rest of your application will naturally evolve out. Once you decide to move onto functional and integration tests you should have a really clear idea about how all the pieces fit together and how your users should 'flow' through your application. This should naturally lead you to develop controllers that fit your different user's scenarios.
Above all, practicing is the best way to get used to starting from scratch. Build some basic apps with really clear outcomes, like a blog or simple scheduling tool. It will help you get used to the process so when you move onto bigger or more abstract applications - you'll be more adept at getting off the ground.
I would start with a tutorial that has you put a small system together from scratch like this:
http://guides.rubyonrails.org/getting_started.html
If you already have everything set up, you can skip to 3.2
Also, make sure you're using version control software and branch/revert often if you don't like the direction it's going (I recommend a distributed version control system like git as they often have better branching/merging)

How to organize a Rails App

For the first time I'm creating a quite complex Rails app.
I'd like to know what's the best way to organize that app by folders. Until now, I've done everything under one app (all the models, controllers, etc.) but reading some open source code I realize that they put everything under different apps.
Like for example Spree Commerce. They have a general folder and inside that they have different apps (API, core, admin, etc). How is that done and is that the best way to do it?
I'd like to get pointed to the best way to do it (a book, blog, anything) so I can understand how I can architect my app for future maintenance.
thank you
As an aside I think the title of your question is a little confusing. Rails, by using convention over configuration, defines 'how to organise a Rails app'. I think your question is rather about how to architect your application as opposed to anything Rails-specific. Maybe tweak the title?
That aside, without knowing any more detail about your project it's a tricky question to answer, but I'll give it a go.
All applications should start off simple, if you believe (like I do) that you should start by building the simplest thing that could possibly work. Given this, since you're using Rails, then in all likelihood the simplest thing would be to structure your app as a vanilla Rails 3 application. This will probably (I say 'probably' because I don't know any specifics about the app) allow you to get a beta version of your app up and running pretty quickly without worrying about complexities which at this stage in the development of your project are not a problem.
If you need to create an XML or JSON-based API then Rails makes this really easy using the standard framework, which will allow you to spend more time thinking about the API design than how to code it, and it's the API design which is the most important thing to get right in the first instance.
Similarly, your Admin site can be part of the same app just in a different namespace. If you find later down the line that you want it as a separate app, you can do this (maybe you could use the awesome API you designed to facilitate this), but why bother designing it with this added complexity (and hence extended development time) in the first place if you don't have a good reason for doing so?
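As a sketch of the namespace approach (Rails 3+ routing syntax; the products resource is hypothetical), the public site and the admin site can share one routes file:

```ruby
# config/routes.rb
Rails.application.routes.draw do
  # Public-facing pages, handled by ProductsController.
  resources :products

  # Admin pages live under /admin and are handled by controllers in the
  # Admin module, e.g. Admin::ProductsController in app/controllers/admin/.
  namespace :admin do
    resources :products
  end
end
```

Because the admin code is already grouped under one module and one URL prefix, pulling it out into a separate app later remains an option.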
Once you have your app up and running and people are starting to use it, you start to get a picture of where the bottlenecks are and where the design could be improved. At this stage, if there's a need, you can start to move parts of the app to scalable solutions, such as running your API as a standalone service, introducing caching, changing data stores and other improvements and optimisations.
Even if your app is wildly successful (and I hope it is!), re-architecting your application whilst continuing to run the existing service is still entirely possible, as Twitter has proved. Just stick to Knuth's statement and you'll be alright.
Regarding reading material, that's a tricky one. For me a lot of the XP and agile development classics taught me a huge amount about how to approach program and app design. I'd also check this StackOverflow topic for book inspiration.
Good luck!
Spree uses Rails' Railties (Rails::Engines). Railties were introduced in Rails 3 to make it more modular and easier to extend. Rails 3 itself is a collection of Railties (ActiveSupport, ActiveModel, ActiveRecord, etc.).
If you are developing a complex app I would suggest spending some time planning its architecture. Designing a complex app without any initial planning will almost certainly end in a maintenance nightmare down the road. It also creates a steep learning curve for new team members, slows down the introduction of new features and, of course, causes frustration.
Anyway, don't over optimize, but don't forget to design your architecture for your needs.
IMHO, I would build even very complex projects as one app. I have reason to believe that Spree and Radiant are built as separate apps because of their open source communities: contributors can contribute code easily without tampering with the core data and the core workings of the application.
Otherwise, you should be alright just building it as one app. Just keep it neat.
Here is what has kept me sane for several years of RoR development:
I use Rails Engines, but keep them in the same codebase as the main app. Here is a good starter for a modular Rails app:
https://github.com/shageman/the_next_big_thing
Wherever I can, I try to reduce coupling and use composition to make things easily testable, reusable and maintainable. This helps to eventually extract a module or engine as a separate gem. Composition is done by routes (mounting), directory overlaying (assets), dependency injection or configuration.
If I don't need to re-use an engine, I put it in the same codebase as the main app, which is a single deployment unit. Thanks to that I don't need to switch between projects in my IDE. In the development environment, any changes to the engine code are instantly picked up by the Rails reload mechanism.
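A rough sketch of an engine kept inside the main app's repository; the engine name Billing and the engines/ folder are assumptions. The three fragments belong to three different files, as the comments indicate.

```ruby
# engines/billing/lib/billing/engine.rb
module Billing
  class Engine < ::Rails::Engine
    # Keeps the engine's routes, models and helpers in its own namespace.
    isolate_namespace Billing
  end
end

# Gemfile of the main app: load the engine from inside the same repository.
gem "billing", path: "engines/billing"

# config/routes.rb of the main app: composition by mounting.
Rails.application.routes.draw do
  mount Billing::Engine, at: "/billing"
end
```

If the engine is ever extracted, only the path: option in the Gemfile needs to change to point at the released gem.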

Whether or not to modularize with Rails plugins?

I'm working on an application which can be logically grouped into a core engine and business domain modules. The business domain modules essentially encapsulate code which is specific to our customers' businesses. We initially separated this out using the root rails structure for our core engine, and having all customer code in separate plugins.
But we've run into various problems with this approach, most of which can probably be put down to Rails class reloading in the development environment. While we have managed to get reloading largely working, we've run into weird Rails bugs with partially unloaded classes combined with the Rails.cache.
What I would like to know is, are we abusing the intended usage pattern for Rails plugins? Was packaging up aspects of our application as plugins the right move? And is there a better way to do it? Or should we rather soldier on and try to sort out these remaining issues?
We're currently moving toward rewriting the plugins as modules within the root rails structure, but I must confess I rather like the elegance of plugin mini-application directory structure.
Brendon McLean.
My large app includes several plugins that are private to the app. I agree that plugins can nicely isolate sets of functionality.
I haven't run into the loading problem that you describe since I turned off the dynamic class reloading in dev mode.
Why not have dynamic class reloading? It seemed to slow things down too much. It's easier to Ctrl-C and restart the test mongrel when needed. After all, I usually run a given iteration of the code for more than one HTML request/reply cycle before making further changes.
Your original plugin architecture sounds like it was solving several problems for you. I'd first try to change the tool (turn off dynamic reloads) before changing the software architecture.
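For reference, a sketch of how reloading is typically switched off in the development environment of Rails versions from that era. This is a config fragment, not a standalone script, and newer Rails versions may wrap or name this setting differently.

```ruby
# config/environments/development.rb
# Load classes once at boot instead of reloading them on every request.
# Code changes then only take effect after restarting the server.
config.cache_classes = true
```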

Running multiple sites from the same rails codebase?

I have a client that wants to take their Rails app that has been successful in one niche and apply it to another similar niche. This new instance of the app is going to start out very similar: all the same functionality, different logo and colors. However, if the new site is successful it will inevitably need significant customizations that shouldn't be applied to the original site. At the same time, if bugs are fixed and improvements are made to one app, then both apps should be able to share those improvements.
Can anyone suggest strategies or resources that address this issue? How do I keep changes that apply to both apps from taking significantly longer to test and implement?
Yes, I know the answer involves SCM, plugins, gems, and Rails engines. These tools are being used and will continue to be, but I want to know when and how to use them to solve this problem.
Links are also welcome.
This question is not the same as:
Multiple websites running on same codebase?
In my question, I'm not running the exact same app with different settings.
How do you sync changes between multiple codebases? I'm asking a similar question, but I'm specifically asking about Rails apps.
We currently work with a setup quite similar with what you are describing.
We started developing a somewhat big Rails app (sales, stock management, product catalogue, etc) for a client. After finishing it, there came several new requests for almost identical functionality.
The original app, however, had to keep being maintained, adding new features, correcting bugs and whatnot.
The extended ones needed to maintain most functionality, but change appearance and looks.
What we did was follow a series of steps:
First we started cleaning up the code: pulling out hardcoded references to tables, reducing and optimizing queries, looking for missing indexes and ways to improve our ActiveRecord use.
After being somewhat satisfied, we started writing the missing tests. I can't stress enough how useful this is, since we maintain the same codebase for several apps and need the core functionality to be as protected as it can be from new changes.
That was also the magic word: core functionality. We started selecting base functionality that could be reused, and extracting all generic code. That gave us a mix of controllers, models and views, which we started to turn into modules, plugins and gems.
What goes where? That depends greatly on your code. As a rule of thumb, functionality that doesn't deal with the domain language goes into plugins (or gems if it doesn't depend too much on Rails).
This approach led us to several plugins and gems, which we then pulled together to reassemble the original project, and that then got its own Git repository. That way, we had a main "template" repository which glued all the components together, and several other Git repositories, one for each of them.
Finally, we developed an easy theme system (basically loading /stylesheets/themes/:theme_name/ and getting theme_name from the DB). Since it's an intranet project, we could do almost anything with proper CSS styling. I'd guess that for working with IE you'd need a more complex approach.
Then, we just used that main repository developing the new functionality on top of it.
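The theme lookup described in the steps above could be sketched like this. The module name, the default theme and the file name main.css are assumptions; the only part taken from the description is the /stylesheets/themes/:theme_name/ layout and the idea that theme_name comes from the DB.

```ruby
module Themes
  DEFAULT = "default"

  # Builds the stylesheet path for a theme name read from the database.
  # Names containing anything other than letters, digits, "_" or "-"
  # fall back to the default theme, so a bad value cannot point outside
  # the themes folder.
  def self.stylesheet_path(theme_name, file = "main.css")
    name = theme_name.to_s
    safe = name.match?(/\A[A-Za-z0-9_-]+\z/) ? name : DEFAULT
    "/stylesheets/themes/#{safe}/#{file}"
  end
end
```

A layout could then call Themes.stylesheet_path with the current client's theme name when emitting its stylesheet link.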
Now, how do we deal with changes to the core base? We start with our template repository. We work out where the fix or change belongs and either make it there or in its corresponding gem/plugin. After properly testing it, we deploy it to our GitHub account.
Finally, we merge/rebase the other projects from that template repository, getting the new updates.
Sounds a bit complicated, but it was only for the setup. The current workflow is quite simple and easy, with the given advantage of working with several developers without bigger issues.
With minimal touching of the main site, it might be possible to use the Ruby code from it while extending the templates and changing the styles. I have worked on that extensively in Django and the layout can look like:
project/
    sites/
        site_one/
            templates/
            models.py
            settings.py
            urls.py
            views.py
        site_two/
            templates/
            models.py
            settings.py
            urls.py
            views.py
    base_app/
    settings.py
You could try do something similar in Rails:
main_webapp/
    app/
    config/
    ...
    sites/
        site_one/
            controllers/
            models/
            views/
        site_two/
            controllers/
            models/
            views/
Assuming the functionality is identical across sites and they just have different layouts and styles, there will be little or no model and controller code. Should you wish to add more functionality to specific sites, just put the code under the desired site folder.
Django also has the concept of Sites and the ability to look for templates in one specific project folder and an app folder. You could try to copy those features and bring them over to Rails to achieve running multiple sites from one codebase.
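The fallback rule (site folder first, shared app folder second) can be sketched in plain Ruby. The resolve_template helper and the file names are hypothetical; in an actual Rails app, prepend_view_path on the controller is the usual way to get this lookup order, and the code below only illustrates the rule itself.

```ruby
require "tmpdir"
require "fileutils"

# Returns the first existing template, preferring the site-specific copy
# in sites/<site>/views over the shared one in app/views.
def resolve_template(root, site, template)
  candidates = [
    File.join(root, "sites", site, "views", template),
    File.join(root, "app", "views", template)
  ]
  candidates.find { |path| File.exist?(path) }
end

# Demo in a throwaway directory: site_one overrides the template,
# site_two does not.
Dir.mktmpdir do |root|
  shared = File.join(root, "app", "views", "home.html.erb")
  custom = File.join(root, "sites", "site_one", "views", "home.html.erb")
  [shared, custom].each do |path|
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, "")
  end

  puts resolve_template(root, "site_one", "home.html.erb") == custom
  puts resolve_template(root, "site_two", "home.html.erb") == shared
end
```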
I recognize that you are looking for a Rails solution, but you can still check out how it's done in Django and copy some of the useful features to the other side. If I like a Rails-specific feature, I'll port it over to Django/Python.
We're doing something similar at my company, except it currently involves multiple environments (production, test, development). We're using SVN as our SCM to keep our code straight; it lets us duplicate the current stable environment and create separate versions of an application (potentially changing things like the logos or certain functionality). I highly recommend running the environment with Apache/Nginx and Phusion's Passenger. This lets us run all of these applications separately, on the same or similar codebase(s). And that's it. We have two DBs, one Production and one Development, to keep our live data separate, but you can easily connect two app instances to the same DB this way. It's worked out really well for us so far in being able to develop, test and deploy multiple web applications without taking down the primary production server.
I know this could be possible using Git Submodules
