I want to create reports for sequential, predetermined periods.
Essentially, I want to be able to:
Set a time period, for example from the 10th of one month to the 9th of the next. Then I want to be able to run a report and have the current period attached to the report, which in this example would be August 10, 2011 to September 9, 2011. Then suppose 6+ months later I run the report again, even though it's 6+ months later the period should be September 10, 2011 to October 9, 2011.
I've thought of creating a period model that would have 'begin', 'end', and 'current' fields. The 'begin' and 'end' fields would hold the numeric day values, i.e. continuing with the above example, 10 and 9 respectively. The 'current' field would hold the current end period date, which (using the above example) would be September 9, 2011. With the 'current' period and the 'beginning' and 'end' values I could then create the next logical period on demand. Further, I'd also have the opportunity to modify the period as needed.
While the above approach should work, it doesn't seem that efficient of an approach. Are there alternative approaches that are better? What can I do, if anything, to improve my approach?
Thanks.
There is an ideal technique to solve your problem. It's called test-driven development. It's ideal in this case since you can describe the solution without having to code it. Just by describing the solution, you develop many clues about what you should actually be coding.
Since you might really have no idea what code to write yet, Cucumber seems ideal in your case. You can describe the problem you want to solve in English, run that using Cucumber, and it will actually point you all the way towards the solution.
If you have some idea about what code you want to write, you might be better off using Test::Unit or RSpec directly. These methods are closer to the actual code: the problem you want to solve needs to be written in Ruby.
Pragmatic Programmer has excellent books about these two techniques.
Since the problem you've described is of particular value only to you, you will be hard pressed to find many actual implementations on StackOverflow I think. Using cucumber to describe the problem for yourself seems the best way to go forward.
Related
Pardon if this question is not appropriate. It is kind of specific and I am not asking for actual code but moreso guidance on whether or not this task is worth undertaking. If this is not the place, please close the question and kindly point me in the correct direction.
Short background: I have always been interested in tinkering. I used to play with partitions and OS X scripts when I was younger, eventually reaching basic-level "general programming" aptitude before my father prohibited my computer usage. I am now going to law school and working at a law firm but I love development and I want to implement more tech innovation in the field.
Main point: At our firm, we have a busy season every year from mid march to the first week of april (immigration + H1B deadline). We receive a lot of documents and scanned files that need to be verified, organized, and checked.
I added (very) simple lines of code to our online platform to help in organization; basically, I attached tags to all incoming documents, and once they were verified, the code would organize them by tag (like "identification doc", "work experience doc" etc.). This would my life much easier every year, as I end up working 100+ hour weeks this season.
I want to take this many steps further with an algorithm that can check for signatures and data mismatches between documents and ultimately organize the documents so they are ready to print. Eventually, I would like to maybe even implement machine learning and a very basic neural network to automate the whole mind-numbing and painful process...
Actual Question(s): I just wanted to know the best way for me to proceed or get started. I know a decent amount of python and java, and we have an online platform already with the documents. What other resources would you recommend in terms of books, videos, or even classes? Is there a name for this kind of basic categorization? Can I build something like this through my own effort without an advanced degree?
Stupid and over-dramatic epilogue: Truth be told, a part of me feels like I wasted my life thus far by not pursuing what I knew I loved at the age of 12. This is my way of making amends I guess, and if I can do this then maybe I can keep doing it in law and beyond...
You don't give many specifics about the task but if you have a finite number of forms in digital form as images, then this seems very possible.
I have personally used OpenCV with Python a lot and more complex machine learning tasks have become increasingly simple in the past 10 years.
Take for example object detection (e.g. 1, 2) to check whether there is anything in a signature field or try extracting the date from an image (e.g 1, 2).
I would suggest you start with the simplest thing that would improve your work. A small and easy task will let you build up your knowledge on how to do things.
I was going through my mails, and saw that gmail automatically suggested me to add coming friday around 5pm to an event on 21st Feb. I am surprised how gmail does this ?
I mean how did it correctly figure out that this friday meant the coming friday, and also that the 5 PM is linked with Friday.
I am a newbie in NLP and machine learning, so if someone can explain it to me in layman terms I would be very glad
I don't think this needs a lot of machine learning as such. A bit of NLP is helpful to get the dependencies from the sentence but even that isn't strictly necessary.
You could start off with just looking at keywords monday,tuesday etc. and then do a look around to see what is around them last monday, next monday, coming monday, previous monday and so on. These are called window features because they provide a window +/- 1,2,3 ... around the feature you are interested in monday. The around 5pm you could theoretically also get from just looking at window features, I don't have an intuition as to how noisy that would be. Try to think of all the ways of expressing time in that context and then think of those ways can be mixed up with something else. Of the top of my head it would seem relatively easy to do that.
Anyhow, the other way is to use a dependency parser to extract the grammatical relations of the elements in the sentence. This requires you to Part of Speech (POS) tag the sentence (after splitting it into tokens). The POS tagger would need to be trained to recognize that friday and monday are nouns, perhaps even that they are temporal expressions, same goes for 5pm and around 5pm. That does require machine learning and a lot of it. The benefit Google has as opposed to others is that they have a lot of data, which allows them to have lots and lots and lots of examples of different ways expressing what essentially is the same thing. This gives their models a lot of breadth. Once you've got the sentence POS tagged, you feed it to a dependency parser (such as the Stanford Dependency Parser) which tells you what the relation between all the different tokens in the sentence is.
Again Google has a lot of data which helps. On top of all this Google has had years to hone the output of the models so that when the models isn't entirely sure what is going on it won't highlight/extract the result. In terms of actually applying NLP in the real world this last step is very important because it given people confidence in what the system is doing. Basically if the software isn't sure what is happening do nothing, because doing something risks doing the wrong thing which then reduces people's confidence on the system as a whole.
Releasing a reliable easy to use NLP application requires a tradeoff between the quality of the NLP/Machine Learning and general software engineering to hide all the parts where the NLP fails from the users.
Try sending yourself email(s) with time expressed in different ways and see which ones Google gets and which ones it doesn't. For instance
Can we meet Friday next week?
How about coffee next week's Friday at 2pm
I can't do Friday but I can meet Wednesday at 4pm
and so on, it's always interesting to poke holes in technology. It can also reveal quite a lot about what it is doing, and how it is doing it.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
I have joined a rails project as a contractor. The project has been going for more than a year. The code is written by about 10 different developers and most of them are contractors as well. They have different code style. Some of them came from Java. The code has horrible scores with metric_fu. Many functions are very long (100 - 300 lines). Some functions have insane amount of logical branches, loops, and recursions. Each request generates a ton of sql queries. Performance is very bad. Many obsolete code that are never used but never got the chance to be cleaned up. The core architecture is plain wrong or over engineered. Code coverage is only about 25%. Views and partials are chaotic and terrible to read and understand.
The manager is in a position trying to satisfy the CEO by continuously adding new features, however newer features are increasingly hard to get implemented correctly without breaking something else. He knows the code is bad, but doesn't want to put too much effort in fixing them as refactoring will take too long.
As a contractor / developer, what is a good way to clear this situation and convenience the Manager or CEO to partition some time for refactoring?
Related Questions
How can I convince skeptical management and colleagues to allow refactoring of awful code?
How to refactor on a budget
Dealing with illogical managers
In my limited experiance:
It's impossible to convince a manager that it's necessary to set aside time to refactor. You can make him aware of it, and reinforce the point every time that you run into an issue because of bad code. Then just move on. Hopefully your boss will figure it out.
It's quite common to get in on a running project and think "this is total junk". Give it some time. You might begin to see a pattern in the madness.
I've been in similar situation. There are basically only two options:
You get some relaxed time and you may be granted time to refactor something
Due to the bad code further development of some component comes to a stall. You can't proceed to add anything because every little change causes everything else to stop functioning. In this emergency case you will get a "go" with refactoring.
I have just answered in some other question, my horror story:
https://stackoverflow.com/questions/1333077/dirty-coding-tricks-to-deliver-project-on-time/1333095#1333095
I have worked on a project where dirty tricks were the main driving principle of the development. Needlees to say, after some time these tricks have started to conflict with each other. In one analytics component, we had to implement the other very dirty trick - to hide away those calculated values which due to the conflicting tricks were not calculated properly. Afterward, the second level tricks started to conflict and we had to create tricks to deal with those. Ever since, even the mentioning of this component makes me feel horror that I may have to work on it again.
It's exactly the second situation where refactoring is the only way out.
In general, many managers without a technical background (actually, those who come from bad programmers as well) neither care nor understand the value of quality code and good architecture. You can't make them listen until something interrupts their plans, like a blow of "non-implementable" features, increasing and reoccurring bugs, customer requests that cannot be satisfied and so on. Only then understanding of the code problems may come for the first time. Usually, it's too late by then.
Refactoring code that sucks is part of coding, so you don't need to get anybody's approval unless your manager is watching your code and or hours VERY closely. The time I save refactoring today is time that I don't have to bill doing mad tricks to get normal code to work tomorrow (so it works out, in the end).
Busting up methods into smaller methods and deleting methods that are not used is part of your job. Reducing DB calls, in code that you call, is also necessary so that your code doesn't suck. Again, not really refactoring, just normal coding.
Convincing your manager depends on other factors, including (but not limited to) their willingness to be convinced, and your ability to convince.
Anyway, what is massive refactoring in RoR? Even if the "core architecture is just plain wrong," it can usually be straightened out a bit at a time. Make sure you break it into chunks /use branches so you don't break anything while you're busy fixing.
If this is impossible, then you come back to the social question of how to convince your manager. That's a simple question of figuring out what his/her buttons are, and pushing them without getting fired nor arrested. Shaming, withholding food, giving prizes, being a friend, anonymous kidnapping threats where you step in and save the day... It's pretty simple, really: creativity is the key!
Everyone is missing a point here:
Refactoring is part of the software development life cycle.
this is not only a RoR or any specific project but any other software development project.
If somehow you could convince your PM why it is important to refactor the existing code base before adding any new feature, you're done. You should clearly tell your PM that any further addition of new feature without any refactoring will take more time than required. And even if the feature is added, somehow, bug resolving sessions will take even more time since the code is very bloat and unmaintainable.
I really don't understand why people forget the principle of optimise later. Optimising later also includes refactor later IMHO.
One more thing, when taking design decisions, you should tell the consequences, good or bad, to your PM very very clearly.
You can create a different branch(I assume you are using git) for refactoring and start adding new feature in some other branch if your PM insist on adding new feature along with refactoring.
A tricky one, i have recently worked in such a company... they were always pushing for new things, again they knew it was bad, but no matter how hard I pushed it - i even got external consultants in to verify my findings - they seen it as a waste of time.
Eventually they seen the light... it only took multiple server crashes and at one point almost a full 8 days of no website to convince them.
Even at that they insisted it 'must' be the hosting service.
The key is to try and quantify how long their site will last before it crunches, and get some external verification to back you up - 'they' always trust outsiders who know nothing about your app! Also, try - if you can - to give a plan that involves gradual replacement at worst, and a plan for how long it would take to do that way. Also a plan for if 1 or 2 bodies were working on a complete rewrite hwo long it might take - but be realistic too or it will bit you in the bum! If you go that route (which is what we done) you can still have some work on the existing site as long as you incorporate it into the new.
I would suggest that you put focus on things that they can see for themselves, that is, they will surely notice that the application is slow in some functionalities, so pick up one of them and say something like "I can reduce the waiting time here, can I take some time to improve this specific thing?" (more well said, but you got the point :P).
Also consider that 10 developers before you did not refactor the code base, this may mean that it is a monstruos task, likely to make the situation worse, in this situation if something will go wrong after the refactoring it will be your fault if the program does not work properly anymore.
Just a though, but worth considering.
I'd take one small chunk of application and refactored and optimized it 'till it shines (and I'd do it in my personal time in order not to annoy my manager). Then you'll be able to show your manager/CEO the good results of refactoring and SQL optimizations.
If there is a need to refactor then the code will speak for itself. Minor refactoring can continue in during development. If you cant convince the manager then probably you should rethink if its necessary at all.
However if it is absolutely necessary then constructing metrics of development activities and the benefits should convince the manager.
I think one of options would be to highlight to the manager how re-factoring the code base now will save time (i.e. money) in the long-run. If the project is expecting to be running long term then making the changes now will clearly save you and other developers time in the future.
Best to use an example of a feature you've worked on estimating how long it would have taken if you had the cleaner code to work from in the first place. Good luck!
I am in same position right now, but with an agreement with the manager that, when the new feature should be implemented in some existing module to re-factor the module too (if it needs re-factoring), we are struggling now with the code created 4-5 years ago and definitely I find out that the re-factoring someone else s code is not trivial nor amusing to do, but very very helpful for the future re-use.
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I recently joined a new company with a large existing codebase. Most of my experience has been in developing small-medium sized apps which I was involved with since the beginning.
Does anyone have any useful tools or advice on how to begin working on a large app?
Do I start at the models and move to controllers? Are there scripts to visually map out model relationships? Should I jump in, start working on it and learn the apps structure as I go?
Railroad should help you understand the big picture. It generates diagrams for your models and controllers.
I found unit tests to be the most efficient, effective and powerful tool. So, before making any change, make sure your application has a minimum LOC so that you won't break any existing feature working on.
You should care about unit tests (of course I'm talking about unit/functional/integrational tests) because:
they ensure you won't break any existing feature
they describe the code so that you won't need tons of comments everywhere to understand why that code fragment acts in that way
having test you'll spent less time debugging and more time coding
When you have tests, you can start refactoring your app.
Here's a list of some tools I usually work with:
Rack::Bug
New Relic
You might want to view some of the wonderful Gregg's videos about Scaling Rails to get more powerful tools.
Also, don't forget to immediately start tracking how your application is performing and whether it is raising exceptions. You can use one of the following tools
Hoptoad
Exceptional
If you need to fix some bug, don't forget to reproduce the issue with a test first, then fix the bug.
Not specific on Rails, but I would start reading the requirements and architecture documentation. After that get familiar with the domain by sketching the models and their relationship on a big sheet of paper.
Then move on to the controllers (maybe look at the routes first).
The views should not contain that much information, I guess you can pretty much skip them.
If you still need to know more, the log of the versioning system (given they use one) is also a good place to get to know how the project evolved.
When I've been in this situation, I try one of three things:
Read all the code top to bottom. This lets you see what code is working, and you can report progress easily (I read through all the view code this week). This means you spend time on things that may not be helpful (unused code) but you get a taste of everything that is there. This is very boring.
Start at the beginning and go to the end. From the login page or splash screen, start looking at that code, then the next page, then the next page. Look at the view, controller, and database code. This takes some time, but it gives you the context for why you need that code or database table. And it allows you do see most often the ones that get used in the most places. This is more interesting.
Start fixing bugs. This has the benefit of showing progress on your new project (happy boss) taking work from other people (happy co-workers) and learning at the same time (happy developer). It provides the context of number 2, and you can skip rarely used code from number 1. This is the most interesting way for me.
Also, keep track of what you've learned. Get a cheap spiral-bound notebook and write down an outline of what you've learned. Imagine yourself giving a talk on the code you're learning about or bug you're fixing. Take enough notes to give that talk, and spice it up with a factoid or two to make it interesting. I give my notebooks dignity and purpose by calling them "Engineering Notebooks", put a title on the front (my name, company, date), and bringing them to every meeting. It looks very professional compared to the guys who don't show up with paper to take notes. For this, don't use a wiki. It can't be relied upon, and I spend a week playing with the wiki instead of learning.
As mentioned above, being the new guy is a good chance to do the things nobody ever got around to like unit tests, documenting processes, or automating running tests. Document the process to set up a new box and get all the software installed to be productive. Take an old box under someone's desk and put a continuous integration install on it, and have it email you when the tests fail. You can forward that email whenever someone else checks in code that breaks the tests. Start writing tests to figure out how things work together, esp. if there aren't any/very many tests.
Try to ask lots of questions in one-on-one situations. This shows you're engaged and learning, and it helps you find the experts in the different parts of the app. On a big project you may need to go to one person for one topic and a different person for other topics. It also helps identify who thinks they know more than they really do or who talks more than you really need.
In FogBugz 6, how do I represent the concepts of a "feature" versus a "task"? As defined by Joel Spolsky, the owner of Fog Creek Software (which makes FogBugz), a feature is essentially a user-visible capability. To estimate the time to implement a feature, the developer should break the implementation into short tasks (2 days max) to ensure they think about each step.
FogBugz has only cases. I can't tell whether they're supposed to correspond to features or tasks. Some FogBugz documentation indicates that each case is a task, which is fine except there is no way to group all the tasks for a given feature together. This is especially odd given that, before FogBugz 6, Joel advocated using a spreadsheet with that grouped all the tasks for each feature. But his own software doesn't appear to meaningfully support that grouping.
I realize that the Joel article I reference includes a disclaimer pointing to a later article. However, the later article does not settle this issue, in fact it doesn't discuss features versus tasks at all, which is surprising given how well Joel advocates for those concepts in the first article.
For FogBugz 6.0 and earlier:
Make a case for each work item (task). FogBugz calls them "Features," only to distinguish them from bugs, but you do want one case for each task.
The best way to group a bunch of tasks is to make a Release (Fix-For) and assign all of the tasks to that release.
Responding to AviD's comment/question to Joel:
So, if you have 10 new features coming
in the next version, with each feature
needing 5 tasks to implement, you
recommend creating 10 releases? And
how do I define that these are the
features/"releases" that are to be
included in the upcoming release?
Here is how we dealt with this specific problem in our development process:
First, we made a regular release schedule: monthly internal releases and quarterly external releases. This schedule never changes but task assignment / feature completion does. This is hugely important in terms of simplifying our inter-human communication: don't try to argue with the calendar.
Major features ("10 new features" in your example) are turned into cases (e.g., case 101 to case 110).
Each task that is a sub-component of a major feature also gets created as a sub-case with a description of what makes this chunk of work an important part of the larger picture. Previously, in Fogbugz 6, we used the "See also" feature by allowing it to search the text for us ("This is a sub-component of case 101" for example). This was effectively the same thing but less aesthetic.
Now that we've broken down the work to its finest level of usefulness, we bring the actual developers into the discussion. Each task and major feature is individually assigned to a particular developer.
The developer determines when they can get their assigned work done by picking the appropriate internal release date that they think they can commit to.
At this point, we have a rough sketch of what will get done for each release. Further refinements continue as the working people actually estimate the hours that they'll need to do the work, enabling evidence-based scheduling, etc.
For AviD's question, though, he would have the release-assignment problem solved by step 5 above.
However, I think point 6 is the most interesting as that's where you really get a solid schedule. For example, if developers are having trouble estimating a larger task, they break it down into sub-cases even further. Notice how my assessment of "finest level of usefulness" can differ (perhaps greatly) from the person who really needs to get the work done.
This is also a time when a developer can reach out to someone else and say "I can do most of this but it would really help if person X could help me with this little piece Y." This is actually where I get most of my development tasking: I personally sit in multiple places during this process, from large-scale planning meetings to little fiddly tasks that no-one else has time to do.
PS: Making it a personal goal to get this answer rated higher than Joel's.... ;-)
PPS: My original response is now overcome by events since Fogbugz 7 has lovely sub-tasks. Program managers love those reports.
You may have better luck asking your questions in the FogBugz Discussion Forum
We use a combination of projects in order to accomplish your grouping goals. We also commonly setup a project "parking" Wiki where links to development cases, technical documentation, systems requirements, user documentation, external links to resources etc. can all be placed. It provides a good "one-stop-shop" for everything related to that project.
As part of that Wiki, we would then setup two specific projects. One in relation to the large overall goals/outlines similar to what might correspond to your Project Management charts/whatnot. One in relation to the development tasks of each feature as they are broken down into the smaller and more manageable chunks. You can then, as was mentioned use case linking to both reference the "master" cases in the other project as well as reference the project Wiki itself so that you can quickly and easily get back to all of your project related information which is conveniently in one spot.
You can accomplish a pile of different organizational structures using FogBugz, you just have to approach things a little differently sometimes in order to hit each and every situation.
Hope that helps.
haha, that article has a disclaimer, but I understand what you are saying.
We use Fogbugz and the only 'Feature' that I am aware of is under category and I don't think you can associated it with sub-tasks.
You can type in 'Case N' is the feature for this task if you just wanted to reference it in the case text.
That kind of stuff sound like is lies more in the project management domain instead of software used to track bugs.
thats a good question, i have asked that myself, too..
we currently test-drive fogbugz for 45 days in a group of 5 developers, and we currently create a "release" for major features. in fact we do not release it, but multiple releases together when something is ready.
there should definately be some sort of advanced task grouping in fogbugz.