I was planning to make two web pages (on different domains) which deal with a similar subject. Articles would be published on the first site, and I would like to show those articles on the other site as well (where, for example, only the last 10 articles would be displayed). What is the best way to realize this?
EDIT: I use PHP/MySQL
You should store your articles in a database which is available to both sites (are they on the same webserver?)
Then on one page you could do this:
SELECT title, summary FROM articles ORDER BY date DESC
and on the other:
SELECT title, `fulltext` FROM articles ORDER BY date DESC LIMIT 10
You can serve both web pages from the same webserver even if the domain names are different.
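For example, the second site could read straight from the shared table. A minimal sketch using PDO, assuming the articles table from the queries above (the DSN and credentials are placeholders; note the backticks, since FULLTEXT is a reserved word in MySQL):

<?php
// Second site: show the last 10 articles from the shared database.
$pdo = new PDO('mysql:host=localhost;dbname=articles_db', 'user', 'password');
$stmt = $pdo->query('SELECT title, `fulltext` FROM articles ORDER BY date DESC LIMIT 10');

foreach ($stmt as $article) {
    echo '<h2>' . htmlspecialchars($article['title']) . '</h2>';
    echo '<div>' . $article['fulltext'] . '</div>';
}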
Sounds like you're not "linking" the two pages together; you're presenting two different views of the same data: the first page shows the full articles, the second page shows perhaps only the titles of the last 10 articles.
If both sites don't have access to the same database, you have to provide some kind of API on your first site that exports the last 10 articles as XML, JSON, or whatever, and include that on your second site.
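A minimal sketch of such an API, again using PHP/MySQL; the endpoint name, site URL, and credentials are made up:

<?php
// articles-api.php on the first site: export the last 10 articles as JSON.
$pdo = new PDO('mysql:host=localhost;dbname=articles_db', 'user', 'password');
$stmt = $pdo->query('SELECT title, summary, date FROM articles ORDER BY date DESC LIMIT 10');

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));

The second site can then fetch and render that export:

<?php
// On the second site: consume the JSON export (URL is hypothetical).
$articles = json_decode(file_get_contents('http://site-one.example/articles-api.php'), true);
foreach ($articles as $article) {
    echo '<h2>' . htmlspecialchars($article['title']) . '</h2>';
}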
If you can't use the same database from the two different sites, you could also create an RSS feed (or similar) of the last 10 articles, and use that to display the articles on the other site!
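A bare-bones RSS 2.0 feed takes only a few lines. A sketch, assuming the same articles table as above (the channel title and link are placeholders):

<?php
// rss.php on the first site: an RSS 2.0 feed of the last 10 articles.
$pdo = new PDO('mysql:host=localhost;dbname=articles_db', 'user', 'password');
$stmt = $pdo->query('SELECT title, summary, date FROM articles ORDER BY date DESC LIMIT 10');

header('Content-Type: application/rss+xml');
echo '<?xml version="1.0" encoding="UTF-8"?>';
echo '<rss version="2.0"><channel>';
echo '<title>My Articles</title><link>http://site-one.example/</link><description>Latest articles</description>';
foreach ($stmt as $a) {
    echo '<item>';
    echo '<title>' . htmlspecialchars($a['title']) . '</title>';
    echo '<description>' . htmlspecialchars($a['summary']) . '</description>';
    echo '<pubDate>' . date(DATE_RSS, strtotime($a['date'])) . '</pubDate>';
    echo '</item>';
}
echo '</channel></rss>';

Any feed reader, or the other site itself, can then consume that URL.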
I'm building a large news site and we'll have several thousand articles; so far we have over 20,000. We plan on having a main menu containing category links, and clicking a link will display the articles matching that category. For example, clicking "baking" will show all articles related to baking, and "baking/cakes" will show everything related to cakes.
Right now, we're weighing whether or not to use hierarchical URLs for each article. If I'm on the "baking/cakes" page, and I click an article that says "Chocolate Raspberry Cake", would it be best to put that article at a specific, hierarchical URL like this:
website.com/baking/cakes/chocolate-raspberry-cake
or a generic, flat one like this:
website.com/articles/chocolate-raspberry-cake
What are the pros and cons of doing each? I can think of cases for each approach, but I'm wondering what you think.
Thanks!
It really depends on the structure of your site. There's no one correct answer for every site.
That being said, here's my recommendation for a news site: instead of embedding the category in the URL, embed the date. For example: website.com/article/2016/11/18/chocolate-raspberry-cake or even website.com/2016/11/18/chocolate-raspberry-cake. This allows you to write about Chocolate Raspberry Cake more than once, as long as you don't do it on the same day. When I'm browsing news I find it helpful to identify the date an article was written as quickly as possible; embedding it in the URL is very helpful.
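To make this concrete, here is a sketch of how a front controller might route such a URL; it assumes the web server rewrites all requests to this script, and findArticle() is a hypothetical lookup helper:

<?php
// Route /article/2016/11/18/chocolate-raspberry-cake to an article lookup.
$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

if (preg_match('#^/article/(\d{4})/(\d{2})/(\d{2})/([a-z0-9-]+)$#', $path, $m)) {
    list(, $year, $month, $day, $slug) = $m;
    // The (date, slug) pair is what makes the URL unique, so the same
    // slug can be reused on a different day.
    // $article = findArticle("$year-$month-$day", $slug); // hypothetical helper
    echo "Article '$slug' from $year-$month-$day";
} else {
    http_response_code(404);
}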
Hierarchical URLs based on categories lock you into a single category for each article, which may be too limiting. There may be articles which fit multiple categories. If you've set up your site to require each article to have a single primary category, then this may not be an issue for you.
Hierarchical URLs based on categories can also be problematic if any of the categories ever change. For example, in the case of typos, changes to pluralization, a new term coming into vogue and replacing an existing term, or even just a change in wording (e.g. "baking" could become "baked goods"). The terms as they existed when you created the article will be forever immortalized in your URL structure, unless you retroactively change them all (invalidating old links, so make sure to use Drupal's Redirect module).
If embedding the date in the URL is not an option, then my second choice would be the flat URL structure because it will give you URLs which are shorter and easier to remember. I would recommend using "article" instead of "articles" in the URL because it saves you a character.
Say I have a series of posts which are paged starting with the newest posts on the first page, and ending with the oldest posts.
I'm trying to implement the paging in such a way as to allow for the following two things:
i) When someone copies the url and sends it to someone else, they will get the same results, even if it is 6 months later and there have been hundreds of posts added in the interim.
ii) When search engines index the content, links from search results will bring back the content that was indexed, even if the content that was on page 3 when it was indexed is now on page 7.
So, to try to explain clearly: if I were to implement paging in the simplest way, I might have URLs looking something like this:
www.foobar.com/foo?page=7
but if someone takes that link and sends it to someone, page 7 could well have completely different content by the time the person comes to look at it. Likewise, the content that was indexed by a search engine would quickly become out of date.
I thought of having an id in the URL identifying the first post on that particular page, instead of the page number. But then I run into issues with how to do the paging when someone comes in from that link; it becomes a bit problematic.
Or perhaps I should just forget these issues, provide permalinks to the posts themselves, and if users send the url to people instead of the permalinks then that's their lookout. I would prefer to cover these two scenarios though, if there is a neat way to do it.
Any help much appreciated.
We've looked at how other people deal with this: they allow visitors to navigate through posts by month and year rather than by page.
So instead of the navigation having page 1, page 2, etc.,
we will have June 2013, May 2013, and so on,
and the URL will look something like:
www.foobar.com/foo/2013/6
This is what WordPress/Blogger seem to do.
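A sketch of how such a URL could map onto a query, assuming a front controller and a posts table with a date column (all names are illustrative). Because the set of posts in a given month never changes, the URL stays stable forever:

<?php
// Map /foo/2013/6 to a month archive.
$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

if (preg_match('#^/foo/(\d{4})/(\d{1,2})$#', $path, $m)) {
    $pdo = new PDO('mysql:host=localhost;dbname=blog', 'user', 'password');
    $stmt = $pdo->prepare(
        'SELECT title, summary FROM posts
         WHERE YEAR(date) = ? AND MONTH(date) = ?
         ORDER BY date DESC'
    );
    $stmt->execute([$m[1], $m[2]]);
    foreach ($stmt as $post) {
        echo '<h2>' . htmlspecialchars($post['title']) . '</h2>';
    }
}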
On the Webmasters Q&A site, I asked the following:
https://webmasters.stackexchange.com/questions/42730/how-does-indeed-com-make-it-to-the-top-of-every-single-search-for-every-single-c
But, I would like a little more information about this from a development perspective.
If you search Google for anything job related, for example, Gastonia Jobs (City + jobs), then, in addition to their search results dominating the first page of Google, you get a URL structure back that looks like this:
indeed.com/l-Gastonia,-NC-jobs.html
I am assuming that the l stands for "location" in the URL structure. If you do a search for an industry-related job, or a job with a specific company name, you will get back something like the following (Microsoft jobs):
indeed.com/q-Microsoft-jobs.html
With just over 40,000 cities in the USA, I thought: OK, maybe it's possible they looped through them and created a page for every single one; that would not be hard for a computer. But then, the site is obviously dynamic, as each of those pages has tens of thousands of results, paginated by 10. The q above obviously stands for "query". The locations I can understand, but they cannot possibly have created a web page for every single query combination, could they?
OK, it gets a tad weirder. I wanted to see if they had a sitemap, so I typed "indeed.com sitemap.xml" into Google. I got the response:
indeed.com/q-Sitemap-xml-jobs.html
Again, I searched for "indeed.com url structure" and, as I mentioned in the other post on Webmasters, I got back:
indeed.com/q-change-url-structure-l-Arkansas.html
Is indeed.com somehow using programming to create a webpage on the fly based on my search input into Google? If they are not, how are they able to have a static page for millions and millions of possible query combinations, have them paginate dynamically, and then have all of those dominate Google's first page of results (albeit that last question may be best for the Webmasters Q&A)?
Does the JavaScript in the page somehow interact with the URL?
It's most likely not a bunch of pages. The "actual" page might be http://indeed.com/?referrer=google&searchterm=jobs%20in%20washington. The site then cleverly produces a human-readable URL using URL rewriting, fetches the jobs in the database that match the query, and voilà...
I could be dead wrong, of course. Truth be told, the technical aspect of it can probably be solved in a multitude of ways. For example, every time a job is added to the site, all the pages needed to match that job might be created, producing an enormous number of pages for Google to crawl.
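To illustrate the rewrite idea: a single script can serve every pretty URL. A sketch, assuming an .htaccess rule along the lines of RewriteRule ^(.+\.html)$ search.php?path=$1, with runSearch() a hypothetical search function:

<?php
// search.php: parse Indeed-style pretty URLs back into search parameters.
$path = isset($_GET['path']) ? $_GET['path'] : '';

if (preg_match('#^l-(.+)-jobs\.html$#', $path, $m)) {
    $location = str_replace('-', ' ', $m[1]); // "l-Gastonia,-NC-jobs.html" -> "Gastonia, NC"
    // runSearch(null, $location);
} elseif (preg_match('#^q-(.+)-jobs\.html$#', $path, $m)) {
    $query = str_replace('-', ' ', $m[1]);    // "q-Microsoft-jobs.html" -> "Microsoft"
    // runSearch($query, null);
}

This would also explain pages like indeed.com/q-Sitemap-xml-jobs.html: any string at all becomes a valid query page.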
This is a great question; however, it remains unanswered. Consider that a basic Google search using
site:indeed.com
returns over 120 million results, and secondly, that a query such as "product manager new york" ranks #1 in the results. These pages are obviously pre-generated, which is confirmed by the fact that the page cached by the search engine (sometimes several days earlier) shows different results from a live query on the site.
It's easy: when Google's search bot crawls the pages on Indeed or any other job search site, those pages are created dynamically. Here is another site I run that works in a similar way to Indeed: http://jobuzu.co.uk
PHP is your friend here, and Indeed doesn't just use standard databases: look into Sphinx and Solr, as they offer full-text search with better performance than MySQL and the like.
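As an illustration (worth verifying against the current docs): Sphinx exposes a MySQL-protocol interface called SphinxQL, on port 9306 by default, so querying a full-text index from PHP looks almost like querying MySQL. The index name jobs is made up:

<?php
// Query a Sphinx full-text index over its MySQL-protocol interface.
$sphinx = mysqli_connect('127.0.0.1', '', '', '', 9306);
$result = mysqli_query($sphinx, "SELECT id FROM jobs WHERE MATCH('product manager new york') LIMIT 10");
while ($row = mysqli_fetch_assoc($result)) {
    // Use $row['id'] to fetch the full job record from your main database.
}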
They also make clever use of rel="canonical" and thorough internal linking:
http://www.indeed.com/find-jobs.jsp
Notice that all the pages that actually rank can be found from that direct internal link structure.
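A sketch of the canonical part; the URL pattern here is just guessed from the examples above:

<?php
// Point every variant of a search results page at one canonical URL,
// so duplicate parameterised URLs consolidate onto the pretty version.
$query = 'Microsoft'; // from user input in practice
$canonical = 'http://www.indeed.com/q-' . str_replace(' ', '-', $query) . '-jobs.html';
echo '<link rel="canonical" href="' . htmlspecialchars($canonical) . '">';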
Is it possible to find out the total number of layouts (templates) used within a website?
For example:
Suppose I want to know how many types of layouts www.flipkart.com uses.
The answer would be something like:
Landing page or Home page
Category Page, e.g. http://www.flipkart.com/mobiles?_l=GIuT6NCRsZbfL9ID9ZKHNQ--&_r=hCno5y6eFUI8C0iWzaQbAg--&ref=cef19a11-4ebc-4f8e-a0dc-401c2d55de3e&_pop=brdcrumb
This is a category page. All such pages have the same layout; only the inner content is different.
Product Page, e.g. http://www.flipkart.com/htc-sensation-mobile-phone/p/itmczbrsnwphgbnw?pid=MOBCYW9HXBUDYJPH&_l=sXQjsX87GxqrvKzhjuOrkw--&_r=n_2yuAC4xgh0SZTuulvAtw--&ref=9305103f-6fc1-497c-807a-8f30ee30c13c
All the product pages have the same layout: they have a Buy Now option, multiple images, and so on. So, is there any existing tool to find this out?
I hope my question is clear. I just want to classify the site's pages into some buckets.
Well, to my knowledge there is no existing tool or algorithm for this, but you can write one. Try to find some attributes of these pages and set them as benchmarks. Then, whenever you encounter a URL and want to identify its category, just extract the attributes again and compare them against the benchmarks you set.
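A rough sketch of that approach in PHP using DOMXPath; the marker class names are invented and would need tuning for each site:

<?php
// Classify a page by which layout markers appear in its HTML.
function classifyPage($url) {
    $dom = new DOMDocument();
    @$dom->loadHTML(file_get_contents($url)); // @ silences warnings on malformed HTML
    $xpath = new DOMXPath($dom);

    // Benchmark attributes (invented class names, to be tuned per site):
    if ($xpath->query("//*[contains(@class, 'buy-now')]")->length > 0) {
        return 'product';  // product pages have a Buy Now button
    }
    if ($xpath->query("//*[contains(@class, 'breadcrumb')]")->length > 0) {
        return 'category'; // category pages have a breadcrumb trail
    }
    return 'other';
}

echo classifyPage('http://www.flipkart.com/mobiles');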
It's not generic, but it will work for specific websites :)
I've seen sites like this (http://www.tradename.net/) on the web that seem to be nothing more than a collection of news articles pulled in from different places, all seemingly automated... I would like to know how I can create something like this that:
(a) either pulls data from different news feeds automatically, on its own, and creates the articles/news content, OR
(b) lets me run a program periodically to update all its content.
I am looking for ready-to-run software or a module that I can take, plug in either the keywords or links to news feeds, and get working... I'm not interested in one of those paid template sites.
Another example: http://www.limitedliability.org/
You can just make your own website like that. Use RSS feeds from the topics / news websites that you would like to show your users. Customize your website however you want using one of the scripting languages; it's not very hard to loop through all the news flashes in an RSS feed and show them to your users.
You can use PHP,
or .NET,
or JavaScript.
And obviously there are more ways to do this. Just take a good look around and check which scripting language you feel most comfortable with.
Create a script that parses the RSS feeds from the news sites and only stores the items that you are interested in.
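A sketch of such a script, meant to be run periodically (e.g. from cron). The feed URLs, keywords, and items table are placeholders, and INSERT IGNORE assumes a unique key on guid so re-runs don't store duplicates:

<?php
// Pull a few feeds, keep items whose titles match your keywords, store them.
$feeds    = ['http://example.com/news/rss', 'http://example.org/feed.xml'];
$keywords = ['trademark', 'liability'];

$pdo    = new PDO('mysql:host=localhost;dbname=aggregator', 'user', 'password');
$insert = $pdo->prepare('INSERT IGNORE INTO items (guid, title, link, published) VALUES (?, ?, ?, ?)');

foreach ($feeds as $feed) {
    $rss = simplexml_load_file($feed);
    if ($rss === false) {
        continue; // skip feeds that fail to load or parse
    }
    foreach ($rss->channel->item as $item) {
        foreach ($keywords as $kw) {
            if (stripos((string) $item->title, $kw) !== false) {
                $insert->execute([
                    (string) $item->guid,
                    (string) $item->title,
                    (string) $item->link,
                    date('Y-m-d H:i:s', strtotime((string) $item->pubDate)),
                ]);
                break; // one keyword match is enough
            }
        }
    }
}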
Or just create your own Google News feed and add it to your site. There are free feeds for non-commercial use.
Available Google News Feeds
RSS Feeds: Incorporate feeds onto my site